UNIQUE ASSOCIATED KAPOSI'S SARCOMA VIRUS SEQUENCES AND
USES THEREOF



The invention disclosed herein was made with
Government support under a co-operative agreement
CCU210852 from the Centers for Disease Control and
Prevention, of the Department of Health and Human
Services. Accordingly, the U.S. Government has
certain rights in this invention.

This application is a continuation-in-part application
of U.S. Serial No. 08/420,235, filed on April 11, 1995 
which is a continuation-in-part application of U.S. 
Serial No. 08/343,101, filed on November 21, 1994, 
which is hereby incorporated by reference. 

Throughout this application, various publications may
be referenced by Arabic numerals in brackets. Full
citations for these publications may be found at the
end of each Experimental Details Section. The
disclosures of the publications cited herein are in
their entirety hereby incorporated by reference into
this application to more fully describe the state of
the art to which this invention pertains.

BACKGROUND OF THE INVENTION

Kaposi's sarcoma (KS) is the most common neoplasm
occurring in persons with acquired immunodeficiency
syndrome (AIDS). Approximately 15-20% of AIDS
patients develop this neoplasm, which rarely occurs in
immunocompetent individuals [13, 14]. Epidemiologic
evidence suggests that AIDS-associated KS (AIDS-KS)
has an infectious etiology. Gay and bisexual AIDS
patients are approximately twenty times more likely
than hemophiliac AIDS patients to develop KS, and KS
may be associated with specific sexual practices among
gay men with AIDS [6, 15, 55, 83]. KS is uncommon
among adult AIDS patients infected through
heterosexual or parenteral HIV transmission, or among
pediatric AIDS patients infected through vertical HIV
transmission [77]. Agents previously suspected of
causing KS include cytomegalovirus, hepatitis B virus,
human papillomavirus, Epstein-Barr virus, human
herpesvirus 6, human immunodeficiency virus (HIV), and
Mycoplasma penetrans [18, 23, 85, 91, 92]. Non-infectious
environmental agents, such as nitrite
inhalants, also have been proposed to play a role in
KS tumorigenesis [33]. Extensive investigations,
however, have not demonstrated an etiologic
association between any of these agents and AIDS-KS
[37, 44, 46, 90].

SUMMARY OF THE INVENTION

This invention provides an isolated DNA molecule which
is at least 30 nucleotides in length and which
uniquely defines a herpesvirus associated with
Kaposi's sarcoma. This invention provides an isolated
herpesvirus associated with Kaposi's sarcoma.

This invention provides methods of vaccinating a
subject against KS, of prophylaxis, of diagnosing or
treating a subject with KS, and of detecting expression
of a DNA virus associated with Kaposi's sarcoma in a cell.

BRIEF DESCRIPTION OF THE FIGURES

Figure 1:

Agarose gel electrophoresis of RDA products from
AIDS-KS tissue and uninvolved tissue. RDA was
performed on DNA extracted from KS skin tissue
and uninvolved normal skin tissue obtained at
autopsy from a homosexual man with AIDS-KS. Lane
1 shows the initial PCR amplified genomic
representation of the AIDS-KS DNA after Bam HI
digestion. Lanes 2-4 show that subsequent cycles
of ligation, amplification, hybridization and
digestion of the RDA products resulted in
amplification of discrete bands at 380, 450, 540
and 680 bp. RDA of the extracted AIDS-KS DNA
performed against itself resulted in a single
band at 540 bp (lane 5). Bands at 380 bp and 680
bp correspond to KS330Bam and KS627Bam
respectively after removal of 28 bp priming
sequences. Bands at 450 and 540 bp hybridized
nonspecifically to both KS and non-KS human DNA.
Lane M is a molecular weight marker.

Figures 2A-2B:

Hybridization of 32P-labelled KS330Bam (Figure 2A)
and KS627Bam (Figure 2B) sequences to a
representative panel of 19 DNA samples extracted
from KS lesions and digested with Bam HI.
KS330Bam hybridized to 11 of the 19 and KS627Bam
hybridized to 12 of the 19 DNA samples from AIDS-KS
lesions. Two additional cases (lanes 12 and
13) were shown to have faint bands with both
KS330Bam and KS627Bam probes after longer
exposure. One negative specimen (lane 3) did not
have microscopically detectable KS in the tissue
specimen. Seven of 8 additional KS DNA samples
also hybridized to both sequences.






Figures 3A-3F: 

Nucleotide sequences of the DNA herpesvirus
associated with KS (KSHV).

Figures 4A-4B: 

PCR amplification of a representative set of KS-derived
DNA samples using KS330₂₃₃ primers.
Figure 4A shows the agarose gel of the
amplification products from 19 KS DNA samples
(lanes 1-19) and Figure 4B shows specific
hybridization of the PCR products to a 32P end-labelled
25 bp internal oligonucleotide (Figure
3B) after transfer of the gel to a nitrocellulose
filter. Negative samples in lanes 3 and 15
respectively lacked microscopically detectable KS
in the sample or did not amplify the constitutive
p53 exon 6, suggesting that these samples were
negative for technical reasons. An additional 8
AIDS-KS samples were amplified and all were
positive for KS330₂₃₃. Lane 20 is a negative
control and Lane M is a molecular weight marker.



Figure 5:

Southern blot hybridization of KS330Bam and
KS627Bam to AIDS-KS genomic DNA extracted from
three subjects (lanes 1, 2, and 3) and digested
with Pvu II. Based on sequence information
(Figure 3A), restriction sites for Pvu II occur
between bp 12351-12362 of the KSHV sequence
(Figure 3A, SEQ ID NO: 1), at bp 124 in KS330Bam
(Figure 3B, SEQ ID NO: 2) and bp 414 in KS627Bam
(Figure 3C, SEQ ID NO: 3). KS330Bam and KS627Bam
failed to hybridize to the same fragments in the
digests, indicating that the two sequences are
separated from each other by one or more
intervening Bam HI restriction fragments.
Digestion with Pvu II and hybridization to
KS330Bam resulted in two distinct banding
patterns (lanes 1 and 2 vs. lane 3), suggesting
variation between KS samples.

Figure 6:

Comparison of amino acid homologies between EBV
ORF BDLF1, HSVSA ORF 26 and a 916 bp reading
frame of the Kaposi's sarcoma agent which
includes KS330Bam. Amino acid identity is
denoted by reverse lettering. In HSVSA, ORF 26
encodes a minor capsid VP23 which is a late gene
product.

Figure 7:

Subculture of Raji cells co-cultivated with BCBL-1
cells treated with TPA for 2 days. PCR shows
that Raji cells are positive for KSHV sequences
and indicates that the agent is a transmissible
virus.

Figure 8:

A schematic diagram of the orientation of KSHV
open reading frames identified on the KS5 20,710
bp DNA fragment. Homologs to each open reading
frame from a corresponding region of the
herpesvirus saimiri (HSVSA) genome are present in
an identical orientation, except for the region
corresponding to the ORF 28 of HSVSA (middle
schematic section). The shading for each open
reading frame corresponds to the approximate %
amino acid identity for the KSHV ORF compared to
this homolog in HSVSA. Noteworthy homologs that
are present in this section of DNA include
homologs to thymidine kinase (ORF21), gH
glycoprotein (ORF22), major capsid protein
(ORF25) and the VP23 protein (ORF26), which
contains the original KS330Bam sequence derived
by representational difference analysis.

Figure 9:

The ~200 kD antigen band appearing on a Western
blot of KS patient sera against BCBL1 lysate (B1)
and Raji lysate (RA). M is molecular weight
marker. The antigen is a doublet between ca. 210
kD and 240 kD.

Figure 10:

5 control patient sera without KS (A1N, A2N, A3N,
A4N and A5N). B1=BCBL1 lysate, RA=Raji lysate.
The 220 kD band is absent from the Western blots
using patient sera without KS.

Figure 11:

In this figure, 0.5 ml aliquots of the gradient
have been fractionated (fractions 1-62) with the
30% gradient fraction being at fraction No. 1 and
the 10% gradient fraction being at fraction No.
62. Each fraction has been dot hybridized to a
nitrocellulose membrane and then a labeled
KSHV DNA fragment, KS631Bam, has been hybridized
to the membrane using standard techniques. The
figure shows that the major solubilized fraction
of the KSHV genome bands (i.e. is isolated) in
fractions 42 through 48 of the gradient with a
high concentration of the genome being present in
fraction 44. A second band of solubilized KSHV
DNA occurs in fractions 26 through 32.



Figure 12:

Location, feature, and relative homologies of KS5
open reading frames compared to translation



WO 96/15779 



PCT/US95/15138



products of herpesvirus saimiri (HVS), equine
herpesvirus 2 (EHV2) and Epstein-Barr virus
(EBV).



Figure 13:

Indirect immunofluorescence end-point and
geometric mean titers (GMT) in AIDS-KS and AIDS
control sera against BHL-6 and P3H3 prior to and
after adsorption with P3H3.

Figure 14:

Genetic map of KS5, a 20.7 kb lambda phage clone
insert derived from a human genomic library
prepared from an AIDS-KS lesion. Seventeen
partial and complete open reading frames (ORFs)
are identified with arrows denoting reading frame
orientations. Comparable regions of the Epstein-Barr
virus (EBV) and herpesvirus saimiri (HVS)
genomes are shown for comparison. Levels of
amino acid similarity between KSHV ORFs are
indicated by shading of EBV and HVS ORFs (black,
over 70% similarity; dark gray, 55-70%
similarity; light gray, 40-54% similarity; white,
no detectable homology). Domains of conserved
herpesvirus sequence blocks and locations of
restriction endonuclease sites used in subcloning
are shown beneath the KSHV map (B, Bam HI site;
N, Not I site). The small Bam HI fragment
(black) in the VP23 gene homolog corresponds to
the KS330Bam fragment generated by
representational difference analysis which was
used to identify the KS5 lambda phage clone.

Figures 15A-15B:

Phylogenetic trees of KSHV based on comparison of
aligned amino acid sequences between
herpesviruses for the MCP gene and for a
concatenated nine-gene set. The comparison of
MCP sequences (Figure 15A) was obtained by the
neighbor-joining method and is shown in unrooted
form with branch lengths proportional to
divergence (mean number of substitution events
per site) between the nodes bounding each branch.
Comparable results were obtained by maximum
parsimony analysis. The number of times out of
100 bootstrap samplings the division indicated by
each internal branch was obtained is shown next
to each branch; bootstrap values below 75 are not
shown. Figure 15B is a phylogenetic tree of
gammaherpesvirus sequences based on a nine-gene
set CS1 (see text) and demonstrates that KSHV is
most closely related to the gamma-2 herpesvirus
sublineage, genus Rhadinovirus. The CS1 amino
acid sequence was used to infer a tree by the
Protml maximum likelihood method; comparable
results, not shown, were obtained with the
neighbor-joining and maximum parsimony methods.
The bootstrap value for the central branch is
marked. On the basis of the MCP analysis, the
root must lie between EBV and the other three
species. Abbreviations for virus species used in
the sequence comparisons are 1)
Alphaherpesvirinae: HSV1 and HSV2, herpes
simplex virus types 1 and 2; EHV1, equine
herpesvirus 1; PRV, pseudorabies virus; and VZV,
varicella-zoster virus, 2) Betaherpesvirinae:
HCMV, human cytomegalovirus; HHV6 and HHV7, human
herpesviruses 6 and 7, and 3) Gammaherpesvirinae:
HVS, herpesvirus saimiri; EHV2, equine
herpesvirus 2; EBV, Epstein-Barr virus; and KSHV,
Kaposi's sarcoma-associated herpesvirus.

Figures 16A-16B:

CHEF gel electrophoresis of BCBL-1 DNA hybridized
to KS631Bam (Figure 16A) and EBV terminal repeat
(Figure 16B). KS631Bam hybridizes to a band at
270 kb as well as to a diffuse band at the
origin. The EBV termini sequence hybridizes to
a 150-160 kb band consistent with the linear form
of the genome. Both KS631Bam (dark arrow) and an
EBV terminal sequence hybridize to high molecular
weight bands immediately below the origin,
indicating possible concatemeric or circular DNA.
The high molecular weight KS631Bam hybridizing
band reproduces poorly but is visible on the
original autoradiographs.

Figure 17:

Induction of KSHV and EBV replication in BCBL-1
with increasing concentrations of TPA. Each
determination was made in triplicate after 48 h
of TPA incubation and hybridization was
standardized to the amount of cellular DNA by
hybridization to beta-actin. The figure shows
the mean and range of relative increase in
hybridizing genome for EBV and KSHV induced by
TPA compared to uninduced BCBL-1. TPA at 20
ng/ml induced an eight-fold increase in EBV
genome (upper line) at 48 h compared to only a
1.4-fold increase in KSHV genome (lower line).
Despite the lower level of KSHV induction,
increased replication of KSHV genome after
induction with TPA concentrations over 10 ng/ml
was reproducibly detected.

Figures 18A-18C:

In situ hybridization with an ORF26 oligomer to
BCBL-1, Raji and RCC-1 cells. Hybridization
occurred to nuclei of KSHV infected BCBL-1
(Figure 18A), but not to uninfected Raji cells
(Figure 18B). RCC-1, a Raji cell line derived by
cultivation of Raji with BCBL-1 in communicating
chambers separated by a 0.45 µm filter, shows rare
cells with positive hybridization to the KSHV
ORF26 probe (Figure 18C).

Figures 19A-19D:

Representative example of IFA staining of BHL-6
with AIDS-KS patient sera and control sera from
HIV-infected patients without KS. Both AIDS-KS
(Figure 19A) and control (Figure 19B) sera show
homogeneous staining of BHL-6 at 1:50 dilution.
After adsorption with paraformaldehyde-fixed P3H3
to remove cross-reacting antibodies directed
against lymphocyte and EBV antigens, antibodies
from AIDS-KS sera localize to BHL-6 nuclei
(Figure 19C). P3H3 adsorption of control sera
eliminates immunofluorescent staining of BHL-6
(Figure 19D).

Figures 20A-20B:

Longitudinal PCR examination for KSHV DNA of
paired PBMC samples from AIDS-KS patients (A) and
homosexual/bisexual AIDS patients without KS (B).
Time 0 is the date of KS onset for cases or other
AIDS-defining illness for controls. All samples
were randomized and examined blindly. Overall,
7 of the KS patients were KSHV positive at both
examination dates (solid bars) and 5 converted
from a negative to positive PBMC sample (forward
striped bars) immediately prior to or after KS
onset. Two previously positive KS patients were
negative after KS diagnosis (reverse striped
bars) and the remaining KS patients were negative
at both timepoints (open bars). Two
homosexual/bisexual control PBMC samples without
KS converted from negative to positive and one
control patient reverted from PCR positive to
negative for KSHV DNA.

Figure 21:

Sample collection characteristics for AIDS-KS
patients, gay/bisexual AIDS patients and
hemophilic AIDS patients.

Figure 22:

PCR analysis of KS330₂₃₃ in DNA samples from
patients with Kaposi's sarcoma and tumor
controls.

DETAILED DESCRIPTION OF THE INVENTION

Definitions

The following standard abbreviations are used
throughout the specification to indicate specific
nucleotides:

C=cytosine     A=adenosine
T=thymidine    G=guanosine

The term "nucleic acids", as used herein, refers to
either DNA or RNA. "Nucleic acid sequence" or
"polynucleotide sequence" refers to a single- or
double-stranded polymer of deoxyribonucleotide or
ribonucleotide bases read from the 5' to the 3' end.
It includes both self-replicating plasmids, infectious
polymers of DNA or RNA and nonfunctional DNA or RNA.

By a nucleic acid sequence "homologous to" or
"complementary to", it is meant a nucleic acid that
selectively hybridizes, duplexes or binds to viral DNA
sequences encoding proteins or portions thereof when
the DNA sequences encoding the viral protein are
present in a human genomic or cDNA library. A DNA
sequence which is homologous to a target sequence can
include sequences which are shorter or longer than the
target sequence so long as they meet the functional
test set forth. Hybridization conditions are
specified along with the source of the cDNA library.

Typically, the hybridization is done in a Southern
blot protocol using a 0.2XSSC, 0.1% SDS, 65°C wash.
The term "SSC" refers to a citrate-saline solution of
0.15 M sodium chloride and 20 mM sodium citrate.
Solutions are often expressed as multiples or
fractions of this concentration. For example, 6XSSC
refers to a solution having a sodium chloride and
sodium citrate concentration of 6 times this amount or
0.9 M sodium chloride and 120 mM sodium citrate.
0.2XSSC refers to a solution 0.2 times the SSC
concentration or 0.03 M sodium chloride and 4 mM
sodium citrate.
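
By way of illustration only, the SSC multiples given above can be
checked with a short calculation. The following Python sketch is not
part of the disclosed methods; the names `NACL_1X_M`, `CITRATE_1X_MM`
and `ssc_concentrations` are hypothetical, and the 1X values are simply
those stated in the definition of "SSC" in this specification.

```python
# Illustrative sketch: names are hypothetical; the 1X values are the ones
# stated in the definition of "SSC" in the surrounding text.
NACL_1X_M = 0.15       # sodium chloride in 1X SSC, mol/L
CITRATE_1X_MM = 20.0   # sodium citrate in 1X SSC, mmol/L


def ssc_concentrations(multiple):
    """Return (NaCl in M, sodium citrate in mM) for a multiple of SSC."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    # Round to suppress floating-point noise such as 0.8999999999999999.
    nacl = round(NACL_1X_M * multiple, 6)
    citrate = round(CITRATE_1X_MM * multiple, 6)
    return nacl, citrate


if __name__ == "__main__":
    for m in (6, 1, 0.2):
        nacl, citrate = ssc_concentrations(m)
        print(f"{m}X SSC: {nacl} M sodium chloride, {citrate} mM sodium citrate")
```

Called with 6 and 0.2, the function reproduces the 6XSSC and 0.2XSSC
figures quoted in the preceding paragraph.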

The phrase "nucleic acid molecule encoding" refers to
a nucleic acid molecule which directs the expression
of a specific protein or peptide. The nucleic acid
sequences include both the DNA strand sequence that is
transcribed into RNA and the RNA sequence that is
translated into protein. The nucleic acid molecule
includes both the full length nucleic acid sequences as
well as non-full length sequences derived from the
full length protein. It is further understood that
the sequence includes the degenerate codons of the
native sequence or sequences which may be introduced
to provide codon preference in a specific host cell.

The phrase "expression cassette" refers to nucleotide
sequences which are capable of affecting expression of
a structural gene in hosts compatible with such
sequences. Such cassettes include at least promoters
and, optionally, transcription termination signals.
Additional factors necessary or helpful in effecting
expression may also be used as described herein.

The term "operably linked" as used herein refers to
linkage of a promoter upstream from a DNA sequence
such that the promoter mediates transcription of the
DNA sequence.

The term "vector" refers to viral expression systems,
autonomous self-replicating circular DNA (plasmids),
and includes both expression and nonexpression
plasmids. Where a recombinant microorganism or cell
culture is described as hosting an "expression
vector," this includes both extrachromosomal circular
DNA and DNA that has been incorporated into the host
chromosome(s). Where a vector is being maintained by
a host cell, the vector may either be stably
replicated by the cells during mitosis as an
autonomous structure, or be incorporated within the
host's genome.



The term "plasmid" refers to an autonomous circular
DNA molecule capable of replication in a cell, and
includes both the expression and nonexpression types.
Where a recombinant microorganism or cell culture is
described as hosting an "expression plasmid", this
includes latent viral DNA integrated into the host
chromosome(s). Where a plasmid is being maintained by
a host cell, the plasmid is either being stably
replicated by the cells during mitosis as an
autonomous structure or is incorporated within the
host's genome.

The phrase "recombinant protein" or "recombinantly
produced protein" refers to a peptide or protein
produced using non-native cells that do not have an
endogenous copy of DNA able to express the protein.
The cells produce the protein because they have been
genetically altered by the introduction of the
appropriate nucleic acid sequence. The recombinant
protein will not be found in association with proteins
and other subcellular components normally associated
with the cells producing the protein.

The following terms are used to describe the sequence
relationships between two or more nucleic acid
molecules or polynucleotides: "reference sequence",
"comparison window", "sequence identity", "percentage
of sequence identity", and "substantial identity". A
"reference sequence" is a defined sequence used as a
basis for a sequence comparison; a reference sequence
may be a subset of a larger sequence, for example, as
a segment of a full-length cDNA or gene sequence given
in a sequence listing or may comprise a complete cDNA
or gene sequence.

Optimal alignment of sequences for aligning a
comparison window may be conducted by the local
homology algorithm of Smith and Waterman (1981) Adv.
Appl. Math. 2:482, by the homology alignment algorithm
of Needleman and Wunsch (1970) J. Mol. Biol. 48:443,
by the search for similarity method of Pearson and
Lipman (1988) Proc. Natl. Acad. Sci. (USA) 85:2444, or
by computerized implementations of these algorithms
(GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin
Genetics Software Package Release 7.0, Genetics
Computer Group, 575 Science Dr., Madison, WI).
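
By way of illustration only, the local homology approach of Smith and
Waterman named above can be sketched in a few lines of Python. This is
a minimal sketch and not the Wisconsin Genetics Software Package
implementation: the function name `smith_waterman_score` is
hypothetical, and the scoring values (match +2, mismatch -1, linear gap
penalty -1) are assumptions chosen only for the example.

```python
# Minimal local-alignment scoring sketch (Smith-Waterman recurrence).
# Scoring parameters are assumed for illustration, not taken from the text.
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)   # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            # A local alignment may restart anywhere, hence the floor of 0.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

The floor of zero in the recurrence is what makes the alignment local:
a poorly matching prefix is discarded rather than penalizing a
well-matching internal segment.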

As applied to polypeptides, the terms "substantial
identity" or "substantial sequence identity" mean that
two peptide sequences, when optimally aligned, such as
by the programs GAP or BESTFIT using default gap weights,
share at least 90 percent sequence identity,
preferably at least 95 percent sequence identity, more
preferably at least 99 percent sequence identity or
more.

"Percentage amino acid identity" or "percentage amino
acid sequence identity" refers to a comparison of the
amino acids of two polypeptides which, when optimally
aligned, have approximately the designated percentage
of the same amino acids. For example, "95% amino acid
identity" refers to a comparison of the amino acids of
two polypeptides which when optimally aligned have 95%
amino acid identity. Preferably, residue positions
which are not identical differ by conservative amino
acid substitutions. For example, the substitution of
amino acids having similar chemical properties such as
charge or polarity is not likely to affect the
properties of a protein. Examples include glutamine
for asparagine or glutamic acid for aspartic acid.
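
By way of illustration only, percentage amino acid identity for two
already-aligned sequences can be computed as sketched below. The
function names, the example sequences, and the choice to exclude gapped
columns from the denominator are assumptions made for this sketch; the
specification does not prescribe a particular denominator. The set of
conservative pairs contains only the two examples named above
(glutamine/asparagine and glutamic acid/aspartic acid).

```python
# Illustrative sketch: names, example sequences and gap handling are assumed.
CONSERVATIVE_PAIRS = {frozenset("QN"), frozenset("ED")}  # Gln/Asn, Glu/Asp


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned, non-gap columns that hold the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # gapped columns are skipped (an assumption)
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared


def mismatches_are_conservative(aligned_a, aligned_b, gap="-"):
    """True if every non-identical, non-gap column is a listed conservative pair."""
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap or x == y:
            continue
        if frozenset((x, y)) not in CONSERVATIVE_PAIRS:
            return False
    return True


if __name__ == "__main__":
    ref = "MKTAYIAKQRQISFVKSHFS"   # hypothetical 20-residue sequence
    var = "MKTAYIAKNRQISFVKSHFS"   # same sequence with one Gln -> Asn change
    print(percent_identity(ref, var), mismatches_are_conservative(ref, var))
```

With one difference in twenty aligned residues, the example pair has
95% amino acid identity, and the single difference is a conservative
substitution of the kind described above.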



The phrase "substantially purified" or "isolated" when
referring to a herpesvirus peptide or protein, means
a chemical composition which is essentially free of
other cellular components. It is preferably in a
homogeneous state although it can be in either a dry
or aqueous solution. Purity and homogeneity are
typically determined using analytical chemistry
techniques such as polyacrylamide gel electrophoresis
or high performance liquid chromatography. A protein
which is the predominant species present in a
preparation is substantially purified. Generally, a
substantially purified or isolated protein will
comprise more than 80% of all macromolecular species
present in the preparation. Preferably, the protein
is purified to represent greater than 90% of all
macromolecular species present. More preferably the
protein is purified to greater than 95%, and most
preferably the protein is purified to essential
homogeneity, wherein other macromolecular species are
not detected by conventional techniques.



The phrase "specifically binds to an antibody" or
"specifically immunoreactive with", when referring to
a protein or peptide, refers to a binding reaction
which is determinative of the presence of the
herpesvirus of the invention in the presence of a
heterogeneous population of proteins and other
biologics including viruses other than the
herpesvirus. Thus, under designated immunoassay
conditions, the specified antibodies bind
herpesvirus antigens and do not bind in a significant
amount to other antigens present in the sample.
Specific binding to an antibody under such conditions
may require an antibody that is selected for its
specificity for a particular protein. For example,
antibodies raised to the human herpesvirus immunogen
described herein can be selected to obtain antibodies
specifically immunoreactive with the herpesvirus
proteins and not with other proteins. These
antibodies recognize proteins homologous to the human
herpesvirus protein. A variety of immunoassay formats
may be used to select antibodies specifically
immunoreactive with a particular protein. For
example, solid-phase ELISA immunoassays are routinely
used to select monoclonal antibodies specifically
immunoreactive with a protein. See Harlow and Lane
[32] for a description of immunoassay formats and
conditions that can be used to determine specific
immunoreactivity.






"Biological sample" as used herein refers to any 
sample obtained from a living organism or from an 
organism that has died. Examples of biological 
samples include body fluids and tissue specimens. 

I. Kaposi's Sarcoma (KS) - Associated Herpesvirus.

This invention provides an isolated DNA molecule which
is at least 30 nucleotides in length and which
uniquely defines a herpesvirus associated with
Kaposi's sarcoma.

In one embodiment the isolated DNA molecule comprises
at least a portion of the nucleic acid sequence as
shown in Figure 3A (SEQ ID NO: 1). In another
embodiment the isolated DNA molecule is a 330 base
pair (bp) sequence. In another embodiment the
isolated DNA molecule is a 12-50 bp sequence. In
another embodiment the isolated DNA molecule is a
30-37 bp sequence.

In another embodiment the isolated DNA molecule is
genomic DNA. In another embodiment the isolated DNA
molecule is cDNA. In another embodiment an RNA is
derived from the isolated nucleic acid molecule or is
capable of hybridizing with the isolated DNA molecule.
As used herein "genomic" means both coding and
non-coding regions of the isolated nucleic acid molecule.

Further, the DNA molecule above may be associated with
lymphoproliferative diseases including, but not
limited to: Hodgkin's disease, non-Hodgkin's lymphoma,
lymphatic leukemia, lymphosarcoma, splenomegaly,
reticular cell sarcoma, Sezary's syndrome, mycosis
fungoides, central nervous system lymphoma, AIDS
related central nervous system lymphoma,
post-transplant lymphoproliferative disorders, and
Burkitt's lymphoma. A lymphoproliferative disorder is
characterized as being the uncontrolled clonal or
polyclonal expansion of lymphocytes involving lymph
nodes, lymphoid tissue and other organs.

This invention provides an isolated nucleic acid
molecule encoding an ORF20 (SEQ ID NOs: 22 and 23),
ORF21 (SEQ ID NOs: 14 and 15), ORF22 (SEQ ID NOs: 16 and
17), ORF23 (SEQ ID NOs: 18 and 19), ORF24 (SEQ ID NOs:
20 and 21), ORF25 (SEQ ID NOs: 2 and 3), ORF26 (SEQ ID
NOs: 24 and 25), ORF27 (SEQ ID NOs: 26 and 27), ORF28
(SEQ ID NOs: 28 and 29), ORF29A (SEQ ID NOs: 30 and 31),
ORF29B (SEQ ID NOs: 4 and 5), ORF30 (SEQ ID NOs: 6 and
7), ORF31 (SEQ ID NOs: 8 and 9), ORF32 (SEQ ID NOs: 32
and 33), ORF33 (SEQ ID NOs: 10 and 11), ORF34 (SEQ ID
NOs: 34 and 35), or ORF35 (SEQ ID NOs: 12 and 13).

This invention provides an isolated polypeptide
encoded by ORF20 (SEQ ID NOs: 22 and 23), ORF21 (SEQ
ID NOs: 14 and 15), ORF22 (SEQ ID NOs: 16 and 17), ORF23
(SEQ ID NOs: 18 and 19), ORF24 (SEQ ID NOs: 20 and 21),
ORF25 (SEQ ID NOs: 2 and 3), ORF26 (SEQ ID NOs: 24 and
25), ORF27 (SEQ ID NOs: 26 and 27), ORF28 (SEQ ID
NOs: 28 and 29), ORF29A (SEQ ID NOs: 30 and 31), ORF29B
(SEQ ID NOs: 4 and 5), ORF30 (SEQ ID NOs: 6 and 7),
ORF31 (SEQ ID NOs: 8 and 9), ORF32 (SEQ ID NOs: 32 and
33), ORF33 (SEQ ID NOs: 10 and 11), ORF34 (SEQ ID NOs:
34 and 35), or ORF35 (SEQ ID NOs: 12 and 13).

For example, TK is encoded by ORF 21; glycoprotein H
(gH) by ORF 22; major capsid protein (MCP) by ORF 25;
virion polypeptide (VP23) by ORF 26; and minor capsid
protein by ORF 27.

This invention provides for a replicable vector
comprising the isolated DNA molecule of the DNA virus.
The vector includes, but is not limited to: a plasmid,
cosmid, λ phage or yeast artificial chromosome (YAC)
which contains at least a portion of the isolated
nucleic acid molecule.

As an example to obtain these vectors, insert and
vector DNA can both be exposed to a restriction enzyme
to create complementary ends on both molecules which
base pair with each other and are then ligated
together with DNA ligase. Alternatively, linkers can
be ligated to the insert DNA which correspond to a
restriction site in the vector DNA, which is then
digested with the restriction enzyme which cuts at
that site. Other means are also available and known
to an ordinary skilled practitioner.
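
By way of illustration only, the creation of complementary ends by a
single restriction enzyme can be sketched in Python. The sketch uses
Bam HI, which recognizes GGATCC and cuts after the first G, leaving a
GATC overhang; the function names and the example sequence are
hypothetical, the sequence is treated as a linear top strand, and the
overhang helper assumes a palindromic site that leaves a 5' overhang.

```python
# Illustrative sketch: function names and the example sequence are assumed.
BAMHI_SITE = "GGATCC"   # Bam HI recognition sequence
BAMHI_CUT_OFFSET = 1    # Bam HI cuts after the first G: G^GATCC


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def digest(seq, site=BAMHI_SITE, cut_offset=BAMHI_CUT_OFFSET):
    """Cut a linear top-strand sequence at every site; return the fragments."""
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def sticky_overhang(site=BAMHI_SITE, cut_offset=BAMHI_CUT_OFFSET):
    """Single-stranded overhang left by a palindromic 5'-overhang cutter."""
    return site[cut_offset:len(site) - cut_offset]


def ends_compatible(overhang_a, overhang_b):
    """Two overhangs can base pair if one is the reverse complement of the other."""
    return overhang_a == reverse_complement(overhang_b)


if __name__ == "__main__":
    print(digest("AAGGATCCTT"))
    print(sticky_overhang(), ends_compatible(sticky_overhang(), sticky_overhang()))
```

Because the GATC overhang is its own reverse complement, any two ends
produced by the same enzyme can base pair, which is why insert and
vector DNA cut with one enzyme can be joined by DNA ligase.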

Regulatory elements required for expression include
promoter or enhancer sequences to bind RNA polymerase
and transcription initiation sequences for ribosome
binding. For example, a bacterial expression vector
includes a promoter such as the lac promoter and for
transcription initiation the Shine-Dalgarno sequence
and the start codon AUG. Similarly, a eukaryotic
expression vector includes a heterologous or
homologous promoter for RNA polymerase II, a
downstream polyadenylation signal, the start codon
AUG, and a termination codon for detachment of the
ribosome. Such vectors may be obtained commercially
or assembled from the sequences described by methods
well-known in the art, for example the methods
described above for constructing vectors in general.

This invention provides a host cell containing the
above vector. The host cell may contain the isolated
DNA molecule artificially introduced into the host
cell. The host cell may be a eukaryotic or bacterial
cell (such as E. coli), yeast cells, fungal cells,
insect cells and animal cells. Suitable animal cells
include, but are not limited to, Vero cells, HeLa
cells, Cos cells, CV1 cells and various primary
mammalian cells.

This invention provides an isolated herpesvirus
associated with Kaposi's sarcoma. In one embodiment
the herpesvirus comprises at least a portion of a
nucleotide sequence as shown in Figures 3A (SEQ ID NO:
1).

In one embodiment the herpesvirus may be a DNA virus.
In another embodiment the herpesvirus may be a
Herpesviridae. In another embodiment the herpesvirus
may be a gammaherpesvirinae. The classification of
the herpesvirus may vary based on the phenotypic or
molecular characteristics which are known to those
skilled in the art.

This invention provides an isolated DNA virus wherein
the viral DNA is about 270 kb in size, wherein the
viral DNA encodes a thymidine kinase, and wherein the
viral DNA is capable of selectively hybridizing to a
nucleic acid probe selected from the group consisting
of SEQ ID NOS: 38-40.



The KS-associated human herpesvirus of the invention
is associated with KS and is involved in the etiology
of the disease. The taxonomic classification of the
virus has not yet been made and will be based on
phenotypic or molecular characteristics known to those
of skill in the art. However, the novel KS-associated
virus is a DNA virus that appears to be related to the
Herpesviridae family and the gammaherpesvirinae
subfamily, on the basis of nucleic acid homology.

A. Sequence identity of the viral DNA and its
proteins.

The human herpesvirus of the invention is not limited
to the virus having the specific DNA sequences
described herein. The KS-associated human
herpesvirus DNA shows substantial sequence identity,
as defined above, to the viral DNA sequences described
herein. DNA from the human herpesvirus typically
selectively hybridizes to one or more of the following
three nucleic acid probes:



Probe 1 (SEQ ID NO: 38):

AGCCGAAAGG ATTCCACCAT TGTGCTCGAA TCCAACGGAT TTGACCCCGT
GTTCCCCATG GTGGTGGCGC AGCAACTGGG GCACGCTATT CTGCAGCAGC
TGTTGGTGTA CCACATGTAC TCCAAAATAT CGGCCGGGGC CCCGGATGAT
GTAAATATGG CGGAACTTGA TCTATATACC ACCAATGTGT CATTTATGGG
GCGCACATAT CGTCTGGACG TAGACAACAC GGA

Probe 2 (SEQ ID NO: 39):

GAAATTACCC ACGAGATCGC TTCCCTGCAC ACCGCACTTG GC~AC?Z~T^
AGTCATCGCC CCGGCCCACG TGGCCGCCAT AACTACAGAC ATGGGAGTAC
ATTGTCAGGA CCTCTTTATG ATTTTCCCAG GGGACGCGTA TCAGGACCGC
CAGCTGCATG ACTATATCAA AATGAAAGCG GGCGTGCAAA CCGGCTZACC
GGGAAACAGA ATGGATCACG TGGGATACAC TGCTGGGGTT CCTCGCTGCG
AGAACCTGCC CGGTTTGAGT CATGGTCAGC TGGCAACCTG CGAGATAATT
CCCACGCCGG TCACATCTGA CGTTGCCT

Probe 3 (SEQ ID NO: 40):

AACACGTCAT GTGCAGGAGT GACATTGTGC CGCGGAGAAA CTCAGACCGC
ATCCCGTAAC CACACTGAGT GGGAAAATCT GCTGGCTATG TTTTCTGTGA
TTATCTATGC CTTAGATCAC AACTGTCACC CG

Hybridization of a viral DNA to the nucleic acid
probes listed above is determined by using standard
nucleic acid hybridization techniques as described
herein. In particular, PCR amplification of a viral
genome can be carried out using the following three
sets of PCR primers:

1) AGCCGAAAGGATTCCACCAT;
   TCCGTGTTGTCTACGTCCAG (SEQ ID NO: 41)

2) GAAATTACCCACGAGATCGC;
   AGGCAACGTCAGATGTGA (SEQ ID NO: 42)

3) AACACGTCATGTGCAGGAGTGAC;
   CGGGTGACAGTTGTGATCTAAGG (SEQ ID NO: 43)
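
The relationship between a primer pair and its probe can be checked in a few lines. The following is a minimal sketch, not part of the original disclosure: it uses Probe 3 and primer set 3 as printed above (with the block spacing removed) and confirms that the first primer matches the 5' end of the probe and that the second primer is the reverse complement of its 3' end. The function names are hypothetical.

```python
# Illustrative sketch only; helper names are hypothetical, not from the source.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def primers_flank(probe: str, forward: str, reverse: str) -> bool:
    """True if `forward` matches the 5' end of `probe` and `reverse`
    is the reverse complement of the 3' end of `probe`."""
    return probe.startswith(forward) and probe.endswith(reverse_complement(reverse))


# Probe 3 (SEQ ID NO: 40) as printed in the text, block spacing removed.
PROBE_3 = (
    "AACACGTCATGTGCAGGAGTGACATTGTGCCGCGGAGAAACTCAGACCGC"
    "ATCCCGTAACCACACTGAGTGGGAAAATCTGCTGGCTATGTTTTCTGTGA"
    "TTATCTATGCCTTAGATCACAACTGTCACCCG"
)
# Primer set 3 as printed in the text.
FORWARD_3 = "AACACGTCATGTGCAGGAGTGAC"
REVERSE_3 = "CGGGTGACAGTTGTGATCTAAGG"

if __name__ == "__main__":
    print("primers flank probe 3:", primers_flank(PROBE_3, FORWARD_3, REVERSE_3))
    print("probe 3 length (nt):", len(PROBE_3))
```

Under these assumptions the expected amplification product of primer set 3 spans the whole of Probe 3.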



In PCR techniques, oligonucleotide primers, as listed
above, complementary to the two 3' borders of the DNA
region to be amplified are synthesized. The
polymerase chain reaction is then carried out using
the two primers. See PCR Protocols: A Guide to
Methods and Applications [74]. Following PCR
amplification, the PCR-amplified regions of a viral
DNA can be tested for their ability to hybridize to
the three specific nucleic acid probes listed above.
Alternatively, hybridization of a viral DNA to the
above nucleic acid probes can be performed by a
Southern blot procedure without viral DNA
amplification and under stringent hybridization
conditions as described herein.



Oligonucleotides for use as probes or PCR primers are
chemically synthesized according to the solid phase
phosphoramidite triester method first described by
Beaucage and Carruthers [19] using an automated
synthesizer, as described in Needham-VanDevanter [69].
Purification of oligonucleotides is by either native
acrylamide gel electrophoresis or by anion-exchange
HPLC as described in Pearson, J.D. and Regnier, F.E.
[75A]. The sequence of the synthetic oligonucleotide
can be verified using the chemical degradation method
of Maxam, A.M. and Gilbert, W. [63].

B. Isolation and propagation of KS-inducing
strains of the Human Herpesvirus

Using conventional methods, the human herpesvirus can
be propagated in vitro. For example, standard
techniques for growing herpes viruses are described in
Ablashi, D.V. [1]. Briefly, PHA stimulated cord blood
mononuclear cells, macrophage, neuronal, or glial cell
lines are cocultivated with cerebrospinal fluid,
plasma, peripheral blood leukocytes, or tissue
extracts containing viral infected cells or purified
virus. The recipient cells are treated with 5 µg/ml
polybrene for 2 hours at 37°C prior to infection.
Infected cells are observed by demonstrating
morphological changes, as well as being positive for
antigens from the human herpesvirus by using
monoclonal antibodies immunoreactive with the human
herpes virus in an immunofluorescence assay.

For virus isolation, the virus is either harvested
directly from the culture fluid by direct
centrifugation, or the infected cells are harvested,
homogenized or lysed and the virus is separated from
cellular debris and purified by standard methods of
isopycnic sucrose density gradient centrifugation.

One skilled in the art may isolate and propagate the
DNA herpesvirus associated with Kaposi's sarcoma
(KSHV) employing the following protocol. Long-term
establishment of a B lymphoid cell line infected with
the KSHV from body-cavity based lymphomas (RCC-1 or
BHL-6) is prepared by extracting DNA from the lymphoma
tissue using standard techniques [27, 49, 66].

The KS associated herpesvirus may be isolated from the
cell DNA in the following manner. An infected cell
line (BHL-6 or RCC-1), which can be lysed using
standard methods such as hypo-osmotic shocking and
Dounce homogenization, is pelleted at 2000xg for 10
minutes, the supernatant is removed and centrifuged
again at 10,000xg for 15 minutes to remove nuclei and
organelles. The supernatant is filtered through a
0.45 µm filter and centrifuged again at 100,000xg for 1
hour to pellet the virus. The virus can then be
washed and centrifuged again at 100,000xg for 1 hour.
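
Because the protocol above states centrifugal force (xg) rather than rotor speed, a bench implementation has to convert each step for the rotor in use. The sketch below, offered as an illustration only, uses the standard relation RCF = 1.118e-5 x r(cm) x rpm^2. The rotor radius of 10 cm is a hypothetical value chosen for the example and is not taken from the text, and the step labels paraphrase the protocol.

```python
import math

# Standard conversion factor for radius in cm and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a rotor speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


# (label, RCF in x g, minutes) following the steps described in the text.
SPIN_STEPS = [
    ("pellet lysate", 2_000, 10),
    ("remove nuclei and organelles", 10_000, 15),
    ("pellet virus", 100_000, 60),
    ("wash virus", 100_000, 60),
]

if __name__ == "__main__":
    radius_cm = 10.0  # hypothetical rotor radius, not from the source
    for label, rcf, minutes in SPIN_STEPS:
        rpm = rpm_from_rcf(rcf, radius_cm)
        print(f"{label}: {rcf} x g for {minutes} min ~ {rpm:,.0f} rpm")
```

In practice each step would use the radius of its own rotor, since the low-speed and ultracentrifuge spins are normally run on different instruments.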

The DNA is tested for the presence of the KSHV by
Southern blotting and PCR using the specific probes as
described hereinafter. Fresh lymphoma tissue
containing viable infected cells is simultaneously
filtered to form a single cell suspension by standard
techniques [49, 66]. The cells are separated by
standard Ficoll-Paque centrifugation and the lymphocyte
layer is removed. The lymphocytes are then placed at
>1x10^6 cells/ml into standard lymphocyte tissue culture
medium, such as RPMI 1640 supplemented with 10% fetal
calf serum. Immortalized lymphocytes containing the
KSHV virus are indefinitely grown in the culture media
while nonimmortalized cells die during the course of
prolonged cultivation.

Further, the virus may be propagated in a new cell
line by removing media supernatant containing the
virus from a continuously infected cell line at a
concentration of >1x10^6 cells/ml. The media is
centrifuged at 2000xg for 10 minutes and filtered
through a 0.45 µm filter to remove cells. The media is
applied in a 1:1 volume with cells growing at >1x10^6
cells/ml for 48 hours. The cells are washed and
pelleted and placed in fresh culture medium, and
tested after 14 days of growth.

RCC-1 and RCC-1 2F = were deposited on October 19, 1994 
under ATCC Accession No. CRL 11734 and CRL 11735,
respectively, pursuant to the Budapest Treaty on the
International Deposit of Microorganisms for the
Purposes of Patent Procedure with the Patent Culture
Depository of the American Type Culture Collection,
12301 Parklawn Drive, Rockville, Maryland 20852 U.S.A.

BHL-6 was deposited on November 16, 1994 under ATCC
Accession No. CRL 11762 pursuant to the Budapest
Treaty on the International Deposit of Microorganisms
for the Purposes of Patent Procedure with the Patent
Culture Depository of the American Type Culture
Collection, 12301 Parklawn Drive, Rockville, Maryland
20852 U.S.A.

C. Immunological Identity of the Virus

The KS-associated human herpesvirus can also be
described immunologically. KS-associated human
herpesviruses are selectively immunoreactive to
antisera generated against a defined immunogen such as
the viral major capsid protein depicted in Seq. ID No.
12, herein. Immunoreactivity is determined in an
immunoassay using a polyclonal antiserum which was
raised to the protein which is encoded by the amino
acid sequence or nucleic acid sequence of SEQ ID NOs:
18-20. This antiserum is selected to have low
crossreactivity against other herpes viruses and any
such crossreactivity is removed by immunoabsorption
prior to use in the immunoassay.

In order to produce antisera for use in an
immunoassay, the protein which is encoded by the amino
acid sequence or nucleic acid of SEQ ID NOs: 18-20 is
isolated as described herein. For example,
recombinant protein can be produced in a mammalian
cell line. An inbred strain of mice such as BALB/c is
immunized with the protein which is encoded by the
amino acid sequence or nucleic acid of SEQ ID NOs:
2-37 using a standard adjuvant, such as Freund's
adjuvant, and a standard mouse immunization protocol
(see [32], supra). Alternatively, a synthetic peptide
derived from the sequences disclosed herein and
conjugated to a carrier protein can be used as an
immunogen. Polyclonal sera are collected and titered
against the immunogen protein in an immunoassay, for
example, a solid phase immunoassay with the immunogen
immobilized on a solid support. Polyclonal antisera
with a titer of 10^4 or greater are selected and tested
for their cross reactivity against other viruses of
the gammaherpesvirinae subfamily, particularly human
herpes virus types 1-7, by using a standard
immunoassay as described in [32], supra. These other
gammaherpesvirinae viruses can be isolated by standard
techniques for isolation of herpes viruses as described
herein.

The ability of the above viruses to compete with the
binding of the antisera to the immunogen protein is
determined. The percent crossreactivity for other
viruses is calculated, using standard calculations.
Those antisera with less than 10% crossreactivity with
each of the other viruses listed above are selected and
pooled. The cross-reacting antibodies are then
removed from the pooled antisera by immunoabsorption
with the above-listed viruses.
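
The text refers only to "standard calculations" for percent crossreactivity. One common convention, assumed here rather than taken from the source, expresses it as 100 times the ratio of the amount of immunogen needed for 50% inhibition to the amount of competing virus needed for the same inhibition. The sketch below applies that convention and the 10% selection limit; the virus labels and numeric values are hypothetical.

```python
from typing import Mapping


def percent_cross_reactivity(ic50_immunogen: float, ic50_competitor: float) -> float:
    """Percent cross-reactivity under the assumed convention:
    100 * (IC50 of immunogen / IC50 of competing virus)."""
    if ic50_immunogen <= 0 or ic50_competitor <= 0:
        raise ValueError("IC50 values must be positive")
    return 100.0 * ic50_immunogen / ic50_competitor


def antiserum_selected(cross_reactivities: Mapping[str, float], limit: float = 10.0) -> bool:
    """True if cross-reactivity with every listed virus is below `limit` percent."""
    return all(value < limit for value in cross_reactivities.values())


if __name__ == "__main__":
    ic50_immunogen = 1.0  # hypothetical units
    # Hypothetical IC50 values for competing viruses, same units.
    competitor_ic50 = {"virus A": 50.0, "virus B": 25.0, "virus C": 8.0}
    results = {
        name: percent_cross_reactivity(ic50_immunogen, ic50)
        for name, ic50 in competitor_ic50.items()
    }
    for name, value in results.items():
        print(f"{name}: {value:.1f}% cross-reactivity")
    print("antiserum selected:", antiserum_selected(results))
```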


The immunoabsorbed and pooled antisera are then used
in a competitive binding immunoassay procedure as
described above to compare an unknown virus
preparation to the specific KS herpesvirus preparation
described herein and containing the nucleic acid
sequence described in SEQ ID NOs: 2-37. In order to
make this comparison, the immunogen protein which is
encoded by the amino acid sequence or nucleic acid of
SEQ ID NOs: 2-37 is the labeled antigen and the virus
preparations are each assayed at a wide range of
concentrations. The amount of each virus preparation
required to inhibit 50% of the binding of the antisera
to the labeled immunogen protein is determined. Those
viruses that specifically bind to an antibody
generated to an immunogen consisting of the protein of
SEQ ID NOs: 2-37 are those viruses where the amount of
virus needed to inhibit 50% of the binding to the
protein does not exceed an established amount. This
amount is no more than 10 times the amount of the
virus that is needed for 50% inhibition for the
KS-associated herpesvirus containing the DNA sequence of
SEQ ID NO: 1. Thus, the KS-associated herpesviruses



WO 96/15779 




of the invention can be defined by immunological
comparison to the specific strain of the KS-associated
herpesvirus for which nucleic acid sequences are
provided herein.
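
The ten-fold criterion described above can be illustrated with a short calculation. The following sketch is an assumption-laden example rather than the method of the disclosure: it estimates the amount of virus giving 50% inhibition by linear interpolation on a log10 scale between the two bracketing points of a dilution series, then tests whether an unknown preparation needs no more than 10 times the amount required by the reference preparation. All data values are hypothetical.

```python
import math
from typing import Sequence


def amount_for_half_inhibition(amounts: Sequence[float], inhibition_pct: Sequence[float]) -> float:
    """Interpolate, on a log10(amount) scale, the amount giving 50% inhibition.

    `amounts` must be positive and ascending; `inhibition_pct` holds the
    matching percent inhibition values and is assumed to be non-decreasing.
    """
    points = list(zip(amounts, inhibition_pct))
    for (a0, i0), (a1, i1) in zip(points, points[1:]):
        if i0 <= 50.0 <= i1 and i1 > i0:
            fraction = (50.0 - i0) / (i1 - i0)
            log_amount = math.log10(a0) + fraction * (math.log10(a1) - math.log10(a0))
            return 10.0 ** log_amount
    raise ValueError("50% inhibition is not bracketed by the data")


def within_tenfold(test_amount: float, reference_amount: float) -> bool:
    """True if the test preparation needs no more than 10x the reference amount."""
    return test_amount <= 10.0 * reference_amount


if __name__ == "__main__":
    # Hypothetical dilution series (arbitrary units) and percent inhibition.
    amounts = [1.0, 10.0, 100.0, 1000.0]
    reference = amount_for_half_inhibition(amounts, [30.0, 65.0, 90.0, 98.0])
    unknown = amount_for_half_inhibition(amounts, [5.0, 20.0, 55.0, 85.0])
    print(f"reference: {reference:.2f}, unknown: {unknown:.2f}")
    print("unknown meets ten-fold criterion:", within_tenfold(unknown, reference))
```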

This invention provides a nucleic acid molecule of at
least 14 nucleotides capable of specifically
hybridizing with the isolated DNA molecule. In one
embodiment, the molecule is DNA. In another
embodiment, the molecule is RNA. In another
embodiment the nucleic acid molecule may be 14-20
nucleotides in length. In another embodiment the
nucleic acid molecule may be 16 nucleotides in length.

This invention provides a nucleic acid molecule of at
least 14 nucleotides capable of specifically
hybridizing with a nucleic acid molecule which is
complementary to the isolated DNA molecule. In one
embodiment, the molecule is DNA. In another
embodiment, the molecule is RNA.

The nucleic acid molecule of at least 14 nucleotides
may hybridize with moderate stringency to at least a
portion of a nucleic acid molecule with a sequence
shown in Figures 3A-3F (SEQ ID NOs: 1, 10-17, and
36-40).

High stringent hybridization conditions are selected
at about 5°C lower than the thermal melting point
(Tm) for the specific sequence at a defined ionic
strength and pH. The Tm is the temperature (under
defined ionic strength and pH) at which 50% of the
target sequence hybridizes to a perfectly matched
probe. Typically, stringent conditions will be those
in which the salt concentration is at least about 0.02
molar at pH 7 and the temperature is at least about
60°C. As other factors may significantly affect the
stringency of hybridization, including, among others,
base composition and size of the complementary
strands, the presence of organic solvents, i.e., salt
or formamide concentration, and the extent of base
mismatching, the combination of parameters is more
important than the absolute measure of any one. For
example, high stringency may be attained by
overnight hybridization at about 68°C in a 6x SSC
solution, washing at room temperature with 6x SSC
solution, followed by washing at about 68°C in a 6x
SSC in a 0.6x SSC solution.
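
The "about 5°C below Tm" rule can be made concrete with an estimate of Tm. The text does not give a formula, so the sketch below assumes a widely used empirical approximation for long DNA:DNA hybrids, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 0.61(% formamide) - 500/L. Published versions of this approximation differ slightly in their coefficients, and the example inputs are hypothetical.

```python
import math


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in an upper-case DNA string."""
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm_celsius(
    gc_pct: float,
    length_nt: int,
    sodium_molar: float,
    formamide_pct: float = 0.0,
) -> float:
    """Empirical Tm estimate for long DNA:DNA hybrids (an assumed formula,
    not one stated in the source)."""
    return (
        81.5
        + 16.6 * math.log10(sodium_molar)
        + 0.41 * gc_pct
        - 0.61 * formamide_pct
        - 500.0 / length_nt
    )


def stringent_temperature(tm_celsius: float, offset: float = 5.0) -> float:
    """Hybridization temperature set `offset` degrees below Tm, as in the text."""
    return tm_celsius - offset


if __name__ == "__main__":
    # Hypothetical probe: 50% GC, 100 nt, 1.0 M sodium, no formamide.
    tm = estimate_tm_celsius(gc_pct=50.0, length_nt=100, sodium_molar=1.0)
    print(f"estimated Tm: {tm:.1f} C, stringent temperature: {stringent_temperature(tm):.1f} C")
```

The estimate also shows why the text stresses the combination of parameters: lowering salt or adding formamide lowers Tm, so the same wash temperature becomes more stringent.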

Hybridization with moderate stringency may be attained
for example by: 1) filter pre-hybridizing and
hybridizing with a solution of 3x sodium chloride,
sodium citrate (SSC), 50% formamide, 0.1M Tris buffer
at pH 7.5, 5x Denhardt's solution; 2) pre-hybridization
at 37°C for 4 hours; 3) hybridization
at 37°C with amount of labelled probe equal to
3,000,000 cpm total for 16 hours; 4) wash in 2x SSC
and 0.1% SDS solution; 5) wash 4x for 1 minute each
at room temperature and 4x at 60°C for 30 minutes each;
and 6) dry and expose to film.
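
The SSC strengths named in these protocols (6x, 3x, 2x, 0.6x) are normally prepared by diluting a concentrated stock. The sketch below applies the usual C1 x V1 = C2 x V2 dilution arithmetic. The 20x stock strength and the 500 ml working volume are assumptions made for the example and are not stated in the text.

```python
def stock_volume_ml(stock_x: float, target_x: float, final_volume_ml: float) -> float:
    """Volume of concentrated stock needed for a working solution (C1*V1 = C2*V2)."""
    if stock_x <= 0 or target_x < 0 or final_volume_ml < 0:
        raise ValueError("concentrations and volume must be non-negative, stock positive")
    if target_x > stock_x:
        raise ValueError("target strength cannot exceed the stock strength")
    return final_volume_ml * target_x / stock_x


if __name__ == "__main__":
    stock_x = 20.0          # assumed 20x SSC stock, not from the source
    final_volume_ml = 500.0  # hypothetical working volume
    for target_x in (6.0, 3.0, 2.0, 0.6):
        stock = stock_volume_ml(stock_x, target_x, final_volume_ml)
        water = final_volume_ml - stock
        print(f"{target_x}x SSC: {stock:.1f} ml stock + {water:.1f} ml water")
```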

The phrase "selectively hybridizing to" refers to a
nucleic acid probe that hybridizes, duplexes or binds
only to a particular target DNA or RNA sequence when
the target sequences are present in a preparation of
total cellular DNA or RNA. By selectively hybridizing
it is meant that a probe binds to a given target in a
manner that is detectable in a different manner from
non-target sequence under high stringency conditions
of hybridization. "Complementary" or
"target" nucleic acid sequences refer to those nucleic
acid sequences which selectively hybridize to a
nucleic acid probe. Proper annealing conditions
depend, for example, upon a probe's length, base
composition, and the number of mismatches and their
position on the probe, and must often be determined
empirically. For discussions of nucleic acid probe
design and annealing conditions, see, for example,
Sambrook et al., [81] or Ausubel, F., et al., [8].

It will be readily understood by those skilled in the 
art and it is intended here, that when reference is 
made to particular sequence listings, such reference 
includes sequences which substantially correspond to 
its complementary sequence and those described 
including allowances for minor sequencing errors, 
single base changes, deletions, substitutions and the 
like, such that any such sequence variation 
corresponds to the nucleic acid sequence of the 
pathogenic organism or disease marker to which the 
relevant sequence listing relates. 

Nucleic acid probe technology is well known to those
skilled in the art who readily appreciate that such
probes may vary greatly in length and may be labeled
with a detectable label, such as a radioisotope or
fluorescent dye, to facilitate detection of the probe.
DNA probe molecules may be produced by insertion of a
DNA molecule having the full-length or a fragment of
the isolated nucleic acid molecule of the DNA virus
into suitable vectors, such as plasmids or
bacteriophages, followed by transforming into suitable
bacterial host cells, replication in the transformed
bacterial host cells and harvesting of the DNA probes,
using methods well known in the art. Alternatively,
probes may be generated chemically from DNA
synthesizers.

DNA virus nucleic acid rearrangements/mutations may be
detected by Southern blotting, single stranded
conformational polymorphism gel electrophoresis
(SSCP), PCR or other DNA based techniques, or for RNA
species by Northern blotting, PCR or other RNA-based
techniques.

RNA probes may be generated by inserting the full
length or a fragment of the isolated nucleic acid
molecule of the DNA virus downstream of a
bacteriophage promoter such as T3, T7 or SP6. Large
amounts of RNA probe may be produced by incubating the
labeled nucleotides with a linearized isolated nucleic
acid molecule of the DNA virus or its fragment where
it contains an upstream promoter in the presence of
the appropriate RNA polymerase.

As defined herein nucleic acid probes may be DNA or
RNA fragments. DNA fragments can be prepared, for
example, by digesting plasmid DNA, or by use of PCR,
or synthesized by either the phosphoramidite method
described by Beaucage and Carruthers, [19], or by the
triester method according to Matteucci, et al., [62],
both incorporated herein by reference. A double
stranded fragment may then be obtained, if desired, by
annealing the chemically synthesized single strands
together under appropriate conditions or by
synthesizing the complementary strand using DNA
polymerase with an appropriate primer sequence. Where
a specific sequence for a nucleic acid probe is given,
it is understood that the complementary strand is also
identified and included. The complementary strand
will work equally well in situations where the target
is a double-stranded nucleic acid. It is also
understood that when a specific sequence is identified
for use as a nucleic probe, a subsequence of the listed
sequence which is 25 basepairs or more in length is
also encompassed for use as a probe.

The DNA molecules of the subject invention also
include DNA molecules coding for polypeptide analogs,
fragments or derivatives of antigenic polypeptides
which differ from naturally-occurring forms in terms
of the identity or location of one or more amino acid
residues (deletion analogs containing less than all of
the residues specified for the protein, substitution
analogs wherein one or more residues specified are
replaced by other residues and addition analogs
wherein one or more amino acid residues is added to a
terminal or medial portion of the polypeptides) and
which share some or all properties of
naturally-occurring forms. These molecules include: the
incorporation of codons "preferred" for expression by
selected non-mammalian hosts; the provision of sites
for cleavage by restriction endonuclease enzymes; and
the provision of additional initial, terminal or
intermediate DNA sequences that facilitate
construction of readily expressed vectors.

This invention provides for an isolated DNA molecule 
which encodes at least a portion of a Kaposi's sarcoma 
associated herpesvirus: virion polypeptide 23, major 
capsid protein, capsid proteins, thymidine kinase, or 
tegument protein.

This invention also provides a method of producing a
polypeptide encoded by the isolated DNA molecule, which
comprises growing the above host vector system under
suitable conditions permitting production of the
polypeptide and recovering the polypeptide so
produced.

This invention provides an isolated peptide encoded by
the isolated DNA molecule associated with Kaposi's
sarcoma. In one embodiment the peptide may be a
polypeptide. Further, this invention provides a host
cell which expresses the polypeptide of the isolated DNA
molecule.

In one embodiment the isolated peptide or polypeptide 
is encoded by at least a portion of an isolated DNA 
molecule. In another embodiment the isolated peptide 
or polypeptide is encoded by at least a portion of a 
nucleic acid molecule with a sequence as set forth in 
(SEQ ID NOs: 2-37).



Further, the isolated peptide or polypeptide encoded
by the isolated DNA molecule may be linked to a second
nucleic acid molecule to form a fusion protein by
expression in a suitable host cell. In one embodiment
the second nucleic acid molecule encodes
beta-galactosidase. Other nucleic acid molecules which are
used to form a fusion protein are known to those
skilled in the art.

This invention provides an antibody which specifically
binds to the peptide or polypeptide encoded by the
isolated DNA molecule. In one embodiment the antibody
is a monoclonal antibody. In another embodiment the
antibody is a polyclonal antibody.

The antibody or DNA molecule may be labelled with a 
detectable marker including, but not limited to: a 
radioactive label, or a colorimetric, a luminescent,
or a fluorescent marker, or gold. Radioactive labels 
include, but are not limited to: 3H, 14C, 32P, 33P, 35S,
36Cl, 51Cr, 57Co, 59Co, 59Fe, 90Y, 125I, 131I, and 186Re.
Fluorescent markers include, but are not limited to:
fluorescein, rhodamine and auramine. Colorimetric
markers include, but are not limited to: biotin, and
digoxigenin. Methods of producing the polyclonal or
monoclonal antibody are known to those of ordinary
skill in the art.

Further, the antibody or nucleic acid molecule complex
may be detected by a second antibody which may be
linked to an enzyme, such as alkaline phosphatase or
horseradish peroxidase. Other enzymes which may be
employed are well known to one of ordinary skill in
the art.

This invention provides a method to select specific
regions on the polypeptide encoded by the isolated DNA
molecule of the DNA virus to generate antibodies.
The protein sequence may be determined from the cDNA
sequence. Amino acid sequences may be analyzed by
methods well known to those skilled in the art to
determine whether they produce hydrophobic or
hydrophilic regions in the proteins which they build.
In the case of cell membrane proteins, hydrophobic
regions are well known to form the part of the protein
that is inserted into the lipid bilayer of the cell
membrane, while hydrophilic regions are located on the
cell surface, in an aqueous environment. Usually, the
hydrophilic regions will be more immunogenic than the
hydrophobic regions. Therefore the hydrophilic amino
acid sequences may be selected and used to generate
antibodies specific to the polypeptide encoded by the
isolated nucleic acid molecule encoding the DNA virus.
The selected peptides may be prepared using
commercially available machines. As an alternative,
DNA, such as a cDNA or a fragment thereof, may be
cloned and expressed and the resulting polypeptide
recovered and used as an immunogen.
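
The text leaves the hydrophobicity analysis to "methods well known to those skilled in the art". One such method, assumed here for illustration and not named in the source, is a sliding-window average over the Kyte-Doolittle hydropathy scale, where low (negative) averages mark hydrophilic stretches that are candidate immunogens. The 7-residue window and the example peptide are hypothetical choices.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 7) -> list:
    """Mean hydropathy of every `window`-residue stretch of `protein`."""
    scores = [KYTE_DOOLITTLE[residue] for residue in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(protein: str, window: int = 7):
    """Return (start index, peptide, mean hydropathy) of the most hydrophilic stretch."""
    profile = hydropathy_profile(protein, window)
    if not profile:
        raise ValueError("protein is shorter than the window")
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, protein[start:start + window], profile[start]


if __name__ == "__main__":
    # Hypothetical sequence: hydrophobic flanks around a charged core.
    example = "LLVIAGLLKDEKRDSEGLLIVAALL"
    start, peptide, score = most_hydrophilic_window(example)
    print(f"candidate peptide {peptide} at position {start}, mean hydropathy {score:.2f}")
```

A fuller analysis would also consider predicted surface exposure and peptide solubility, which this sketch does not attempt.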

Polyclonal antibodies against these peptides may be
produced by immunizing animals using the selected
peptides. Monoclonal antibodies are prepared using
hybridoma technology by fusing antibody producing B
cells from immunized animals with myeloma cells and
selecting the resulting hybridoma cell line producing
the desired antibody. Alternatively, monoclonal
antibodies may be produced by in vitro techniques
known to a person of ordinary skill in the art. These
antibodies are useful to detect the expression of
polypeptide encoded by the isolated DNA molecule of
the DNA virus in living animals, in humans, or in
biological tissues or fluids isolated from animals or
humans.



II. Immunoassays



The antibodies raised against the viral strain or
peptides may be detectably labelled, utilizing
conventional labelling techniques well-known to the
art. Thus, the antibodies may be radiolabelled using,
for example, radioactive isotopes such as 3H, 125I, 131I,
and 35S.

The antibodies may also be labelled using fluorescent
labels, enzyme labels, free radical labels, or
bacteriophage labels, using techniques known in the
art. Typical fluorescent labels include fluorescein
isothiocyanate, rhodamine, phycoerythrin, phycocyanin,
allophycocyanin, and Texas Red.

Since specific enzymes may be coupled to other
molecules by covalent links, the possibility also
exists that they might be used as labels for the
production of tracer materials. Suitable enzymes
include alkaline phosphatase, beta-galactosidase,
glucose-6-phosphate dehydrogenase, malate
dehydrogenase, and peroxidase. Two principal types of
enzyme immunoassay are the enzyme-linked immunosorbent
assay (ELISA), and the homogeneous enzyme immunoassay,
also known as enzyme-multiplied immunoassay (EMIT,
Syva Corporation, Palo Alto, CA). In the ELISA
system, separation may be achieved, for example, by
the use of antibodies coupled to a solid phase. The
EMIT system depends on deactivation of the enzyme in
the tracer-antibody complex; the activity can thus be
measured without the need for a separation step.

Additionally, chemiluminescent compounds may be used 
as labels. Typical chemiluminescent compounds include 
luminol, isoluminol, aromatic acridinium esters, 
imidazoles, acridinium salts, and oxalate esters. 
Similarly, bioluminescent compounds may be utilized 
for labelling, the bioluminescent compounds including 
luciferin, luciferase, and aequorin. 

Once labeled, the antibody may be employed to identify
and quantify immunologic counterparts (antibody or
antigenic polypeptide) utilizing techniques well-known
to the art.

A description of a radioimmunoassay (RIA) may be found
in Laboratory Techniques in Biochemistry and Molecular
Biology, with particular reference to the chapter
entitled "An Introduction to Radioimmune Assay and
Related Techniques" by Chard, T., incorporated by
reference herein.

A description of general immunometric assays of
various types can be found in the following U.S. Pat.
Nos. 4,376,110 (David et al.) or 4,098,876 (Piasio).



A. Assays for viral antigens



In addition to the detection of the causal agent using
nucleic acid hybridization technology, one can use
immunoassays to detect for the virus, specific
peptides, or for antibodies to the virus or peptides.
A general overview of the applicable technology is in
Harlow and Lane [32], incorporated by reference
herein.

In one embodiment, antibodies to the human herpesvirus
can be used to detect the agent in the sample. In
brief, to produce antibodies to the agent or peptides,
the sequence being targeted is expressed in
transfected cells, preferably bacterial cells, and
purified. The product is injected into a mammal
capable of producing antibodies. Either monoclonal or
polyclonal antibodies (as well as any recombinant
antibodies) specific for the gene product can be used
in various immunoassays. Such assays include
competitive immunoassays, radioimmunoassays, Western
blots, ELISA, indirect immunofluorescent assays and
the like. For competitive immunoassays, see Harlow
and Lane [32] at pages 567-573 and 584-589.

Monoclonal antibodies or recombinant antibodies may be
obtained by various techniques familiar to those
skilled in the art. Briefly, spleen cells or other
lymphocytes from an animal immunized with a desired
antigen are immortalized, commonly by fusion with a
myeloma cell (see, Kohler and Milstein [50],
incorporated herein by reference). Alternative
methods of immortalization include transformation with
Epstein Barr Virus, oncogenes, or retroviruses, or
other methods well known in the art. Colonies arising
from single immortalized cells are screened for
production of antibodies of the desired specificity
and affinity for the antigen, and yield of the
monoclonal antibodies produced by such cells may be
enhanced by various techniques, including injection
into the peritoneal cavity of a vertebrate host. New
techniques using recombinant phage antibody expression
systems can also be used to generate monoclonal
antibodies. See for example: McCafferty, J. et al.
[64]; Hoogenboom, H.R. et al. [39]; and Marks, J.D. et
al. [60].

Such peptides may be produced by expressing the 
specific sequence in a recombinantly engineered cell
such as bacteria, yeast, filamentous fungal, insect 
(especially employing baculoviral vectors), and 
mammalian cells. Those of skill in the art are 
knowledgeable in the numerous expression systems 
available for expression of herpes virus protein. 

Briefly, the expression of natural or synthetic
nucleic acids encoding viral protein will typically be
achieved by operably linking the desired sequence or
portion thereof to a promoter (which is either
constitutive or inducible), and incorporated into an
expression vector. The vectors are suitable for
replication or integration in either prokaryotes or
eukaryotes. Typical cloning vectors contain
antibiotic resistance markers, genes for selection of
transformants, inducible or regulatable promoter
regions, and translation terminators that are useful
for the expression of viral genes.

Methods for the expression of cloned genes in bacteria
are also well known. In general, to obtain high level
expression of a cloned gene in a prokaryotic system,
it is advisable to construct expression vectors
containing a strong promoter to direct mRNA
transcription. The inclusion of selection markers in
DNA vectors transformed in E. coli is also useful.
Examples of such markers include genes specifying
resistance to antibiotics. See [81] supra, for
details concerning selection markers and promoters for
use in E. coli. Suitable eukaryote hosts may include
plant cells, insect cells, mammalian cells, yeast, and

Methods for characterizing naturally processed
peptides bound to MHC (major histocompatibility
complex) I molecules have been developed. See, Falk
et al. [24], and PCT publication No. WO 92/21033
published November 26, 1992, both of which are
incorporated by reference herein. Typically, these
methods involve isolation of MHC class I molecules by
immunoprecipitation or affinity chromatography from an
appropriate cell or cell line. Other methods involve
direct amino acid sequencing of the more abundant
peptides in various HPLC fractions by known automatic
sequencing of peptides eluted from Class I molecules
of the B cell type (Jardetzky, et al. [45],
incorporated by reference herein), and of the human MHC
class I molecule, HLA-A2.1 type by mass spectrometry
(Hunt, et al. [40], incorporated by reference herein).
See also, Rotzschke and Falk [79], incorporated by
reference herein for a general review of the
characterization of naturally processed peptides in
MHC class I. Further, Marloes, et al. [61],
incorporated by reference herein, describe how class
I binding motifs can be applied to the identification
of potential viral immunogenic peptides in vitro.

The peptides described herein produced by recombinant
technology may be purified by standard techniques well
known to those of skill in the art. Recombinantly
produced viral sequences can be directly expressed or
expressed as a fusion protein. The protein is then
purified by a combination of cell lysis (e.g.,
sonication) and affinity chromatography. For fusion
products, subsequent digestion of the fusion protein
with an appropriate proteolytic enzyme releases the
desired peptide.

The proteins may be purified to substantial purity by
standard techniques well known in the art, including
selective precipitation with such substances as
ammonium sulfate, column chromatography,
immunopurification methods, and others. See, for
instance, Scopes, R. [84], incorporated herein by
reference.



B. Serological tests for the presence of
antibodies to the human herpesvirus.

This invention further embraces diagnostic kits for
detecting the presence of a KS agent in biological
samples, such as serum or solid tissue samples,
comprising a container containing antibodies to the
human herpesvirus, and instructional material for
performing the test. Alternatively, inactivated viral
particles or peptides or viral proteins derived from
the human herpesvirus may be used in a diagnostic kit
to detect antibodies specific to the KS associated
human herpesvirus.



Diagnostic kits for detecting the presence of a KS
agent in tissue samples, such as skin samples or
samples of other affected tissue, comprising a
container containing a nucleic acid sequence specific
for the human herpesvirus and instructional material
for detecting the KS-associated herpesvirus are also
included. A container containing nucleic acid primers
to any one of such sequences is optionally included as
are antibodies to the human herpesvirus as described
herein.

Antibodies reactive with antigens of the human
herpesvirus can also be measured by a variety of
immunoassay methods that are similar to the procedures
described above for measurement of antigens. For a
review of immunological and immunoassay procedures
applicable to the measurement of antibodies by
immunoassay techniques, see Basic and Clinical
Immunology, 7th Edition [12], and [32], supra.

In brief, immunoassays to measure antibodies reactive
with antigens of the KS-associated human herpesvirus
can be either competitive or noncompetitive binding
assays. In competitive binding assays, the sample
analyte competes with a labeled analyte for specific
binding sites on a capture agent bound to a solid
surface. Preferably the capture agent is a purified
recombinant human herpesvirus protein produced as
described above. Other sources of human herpesvirus
proteins, including isolated or partially purified
naturally occurring protein, may also be used.

Noncompetitive assays are typically sandwich assays,
in which the sample analyte is bound between two
analyte-specific binding reagents. One of the binding
agents is used as a capture agent and is bound to a
solid surface. The second binding agent is labelled
and is used to measure or detect the resultant complex
by visual or instrument means. A number of
combinations of capture agent and labelled binding
agent can be used. A variety of different immunoassay
formats, separation techniques and labels can also
be used similar to those described above for the
measurement of the human herpesvirus antigens.

Hemagglutination Inhibition (HI) and Complement
Fixation (CF) are two laboratory tests that can
be used to detect infection with human herpesvirus by
testing for the presence of antibodies against the
virus or antigens of the virus.

Serological methods can also be useful when one
wishes to detect antibody to a specific variant. For
example, one may wish to see how well a vaccine
recipient has responded to the new variant.
Alternatively, one may take serum from a patient to
see which variant the patient responds to best.

This invention provides an antagonist capable of
blocking the expression of the peptide or polypeptide
encoded by the isolated DNA molecule. In one
embodiment the antagonist is capable of hybridizing
with a double stranded DNA molecule. In another
embodiment the antagonist is a triplex oligonucleotide
capable of hybridizing to the DNA molecule. In
another embodiment the triplex oligonucleotide is
capable of binding to at least a portion of the
isolated DNA molecule with a nucleotide sequence as
shown in Figure 3A-3F (SEQ ID NOs: 1-37).

This invention provides an antisense molecule capable 
of hybridizing to the isolated DNA molecule. In one 
embodiment the antisense molecule is DNA. In another 
embodiment the antisense molecule is RNA. 

The antisense molecule may be DNA or RNA or variants
thereof (i.e. DNA or RNA with a protein backbone).
The present invention extends to the preparation of
antisense nucleotides and ribozymes that may be used
to interfere with the expression of the receptor
recognition proteins at the translation of a specific
mRNA, either by masking that mRNA with an antisense
nucleic acid or cleaving it with a ribozyme.



Antisense nucleic acids are DNA or RNA molecules that
are complementary to at least a portion of a specific
mRNA molecule. In the cell, they hybridize to that
mRNA, forming a double stranded molecule. The cell
does not translate an mRNA in this double-stranded
form. Therefore, antisense nucleic acids interfere
with the expression of mRNA into protein. Oligomers
of about fifteen nucleotides and molecules that
hybridize to the AUG initiation codon are particularly
efficient, since they are easy to synthesize and are
likely to pose fewer problems than larger molecules
upon introduction to cells.
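
The complementarity described above can be illustrated
with a minimal sketch. The function names, the example
sequence, and the choice to centre the oligomer on the
first AUG are assumptions made for illustration only;
they are not taken from this disclosure and do not
represent a validated oligomer design procedure.

```python
# Illustrative sketch only: derive an antisense RNA oligomer of about
# fifteen nucleotides that spans the AUG initiation codon of an mRNA.
# All names and the windowing rule are hypothetical.

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(target: str) -> str:
    """Return the reverse complement (written 5'->3') of an RNA segment."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(target.upper()))


def antisense_over_start_codon(mrna: str, length: int = 15) -> str:
    """Return an antisense oligomer covering the first AUG in `mrna`."""
    mrna = mrna.upper()
    if len(mrna) < length:
        raise ValueError("mRNA is shorter than the requested oligomer")
    start = mrna.find("AUG")
    if start < 0:
        raise ValueError("no AUG initiation codon found")
    # Centre the window on the AUG, then clamp it to the sequence bounds.
    left = max(0, min(start - (length - 3) // 2, len(mrna) - length))
    return antisense_rna(mrna[left:left + length])


if __name__ == "__main__":
    example_mrna = "GGCACCAUGGCUAGCUUAGGC"  # made-up sequence
    print(antisense_over_start_codon(example_mrna))
```

Because the oligomer is the reverse complement of its
target window, it contains CAU opposite the AUG codon.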

This invention provides a transgenic nonhuman mammal
which comprises at least a portion of the isolated DNA
molecule introduced into the mammal at an embryonic
stage. Methods of producing a transgenic nonhuman
mammal are known to those skilled in the art.

This invention provides a cell line containing the
isolated KS associated herpesvirus of the subject
invention. In one embodiment the isolated DNA
molecule is artificially introduced into the cell.
Cell lines include, but are not limited to:
fibroblasts, such as HFF, NIH/3T3; epithelial cells,
such as 5637; lymphocytes, such as FCB; T-cells, such
as CCRF-CEM (ATCC CCL 119); B-cells, such as BJAB and
Raji (ATCC CCL 86); and myeloid cells such as K562
(ATCC CCL 243); Vero cells and carcinoma cells.
Methods of producing such cell lines are known to
those skilled in the art. In one embodiment the
isolated KS associated herpesvirus is introduced into
a RCC-1 cell line.

III. In vitro diagnostic assays for the detection of
KS

This invention provides a method of diagnosing
Kaposi's sarcoma in a subject which comprises: (a)
obtaining a nucleic acid molecule from a tumor lesion
of the subject; (b) contacting the nucleic acid
molecule with a labelled nucleic acid molecule of at
least 15 nucleotides capable of specifically
hybridizing with the isolated DNA, under hybridizing
conditions; and (c) determining the presence of the
nucleic acid molecule hybridized, the presence of
which is indicative of Kaposi's sarcoma in the
subject, thereby diagnosing Kaposi's sarcoma in the
subject.
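
Steps (b) and (c) can be pictured with a simplified
sketch in which hybridization is reduced to exact
base-pair complementarity between a probe and either
strand of the sample. This is an assumption made for
illustration: real hybridization depends on stringency
conditions and tolerates mismatches, and the function
names and sequences below are hypothetical.

```python
# Illustrative sketch only: model "specific hybridization" as an exact
# match between a labelled probe (at least 15 nucleotides) and either
# strand of a double-stranded sample. Names are hypothetical.

DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(DNA_COMPLEMENT)[::-1]


def probe_hybridizes(sample_strand: str, probe: str, min_length: int = 15) -> bool:
    """Return True if `probe` can pair perfectly with either sample strand.

    `sample_strand` is one strand of the sample; the other strand is
    implied by complementarity.
    """
    if len(probe) < min_length:
        raise ValueError("probe must be at least %d nucleotides" % min_length)
    sample_strand = sample_strand.upper()
    probe = probe.upper()
    pairs_with_given_strand = reverse_complement(probe) in sample_strand
    pairs_with_other_strand = probe in sample_strand
    return pairs_with_given_strand or pairs_with_other_strand


if __name__ == "__main__":
    sample = "TTGACCGTAGGCTAACGTTAGCCATG"       # made-up sequence
    probe = reverse_complement(sample[3:20])    # 17-nucleotide probe
    print(probe_hybridizes(sample, probe))
```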

In one embodiment the DNA molecule from the tumor
lesion is amplified before step (b). In another
embodiment PCR is employed to amplify the nucleic acid
molecule. Methods of amplifying nucleic acid
molecules are known to those skilled in the art.

A person of ordinary skill in the art will be able to
obtain an appropriate DNA sample for diagnosing Kaposi's
sarcoma in the subject. The DNA sample obtained by
the above described method may be cleaved by a
restriction enzyme. The uses of restriction enzymes
to cleave DNA and the conditions to perform such
cleavage are well-known in the art.



In the above described methods, a size fractionation
may be employed which is effected by a polyacrylamide
gel. In one embodiment, the size fractionation is
effected by an agarose gel. Further, transferring the
DNA fragments into a solid matrix may be employed
before a hybridization step. One example of such
solid matrix is nitrocellulose paper.
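
The cleavage and size fractionation steps can be
sketched in silico as follows. The sketch assumes a
linear DNA molecule and uses EcoRI (recognition site
GAATTC, cut after the first G) purely as a familiar
example; the choice of enzyme, the function names, and
the example sequence are assumptions and are not
specified by this disclosure.

```python
# Illustrative sketch only: cut a linear DNA sequence at each occurrence
# of a restriction site, then order the fragments by size as a gel would
# (largest fragments migrate the least). Names are hypothetical.

from typing import List


def digest(dna: str, site: str, cut_offset: int) -> List[str]:
    """Cut `dna` at every `site`; the cut falls `cut_offset` bases into the site."""
    dna = dna.upper()
    site = site.upper()
    fragments = []
    fragment_start = 0
    position = dna.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(dna[fragment_start:cut])
        fragment_start = cut
        position = dna.find(site, position + 1)
    fragments.append(dna[fragment_start:])
    return [fragment for fragment in fragments if fragment]


def size_fractionate(fragments: List[str]) -> List[str]:
    """Return fragments ordered from largest to smallest."""
    return sorted(fragments, key=len, reverse=True)


if __name__ == "__main__":
    example = "AAA" + "GAATTC" + "CCCCC" + "GAATTC" + "TT"  # made-up sequence
    for fragment in size_fractionate(digest(example, "GAATTC", 1)):
        print(len(fragment), fragment)
```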

This invention provides a method of diagnosing
Kaposi's sarcoma in a subject which comprises: (a)
obtaining a nucleic acid molecule from a suitable
bodily fluid of the subject; (b) contacting the
nucleic acid molecule with a labelled nucleic acid
molecule of at least 15 nucleotides capable of
specifically hybridizing with the isolated DNA, under
hybridizing conditions; and (c) determining the
presence of the nucleic acid molecule hybridized, the
presence of which is indicative of Kaposi's sarcoma in
the subject, thereby diagnosing Kaposi's sarcoma in
the subject.

This invention provides a method of diagnosing a DNA
virus in a subject, which comprises (a) obtaining a
suitable bodily fluid sample from the subject, (b)
contacting the suitable bodily fluid of the subject to
a support having already bound thereto a Kaposi's
sarcoma antibody, so as to bind the Kaposi's sarcoma
antibody to a specific Kaposi's sarcoma antigen, (c)
removing unbound bodily fluid from the support, and
(d) determining the level of Kaposi's sarcoma antibody
bound by the Kaposi's sarcoma antigen, thereby
diagnosing the subject for Kaposi's sarcoma.

This invention provides a method of diagnosing
Kaposi's sarcoma in a subject, which comprises (a)
obtaining a suitable bodily fluid sample from the
subject, (b) contacting the suitable bodily fluid of
the subject to a support having already bound thereto
a Kaposi's sarcoma antigen, so as to bind Kaposi's
sarcoma antigen to a specific Kaposi's sarcoma
antibody, (c) removing unbound bodily fluid from the
support, and (d) determining the level of the Kaposi's
sarcoma antigen bound by the Kaposi's sarcoma
antibody, thereby diagnosing Kaposi's sarcoma.



This invention provides a method of detecting
expression of a DNA virus associated with Kaposi's
sarcoma in a cell which comprises obtaining total cDNA
from the cell, contacting the cDNA so
obtained with a labelled DNA molecule under
hybridizing conditions, determining the presence of
cDNA hybridized to the molecule, and thereby detecting
the expression of the DNA virus. In one embodiment
mRNA is obtained from the cell to detect expression of
the DNA virus.

The suitable bodily fluid sample is any bodily fluid 
sample which would contain Kaposi's sarcoma antibody,
antigen or fragments thereof. A suitable bodily fluid 
includes, but is not limited to: serum, plasma, 
cerebrospinal fluid, lymphocytes, urine, transudates, 
or exudates. In the preferred embodiment, the 
suitable bodily fluid sample is serum or plasma. In 
addition, the bodily fluid sample may be cells from 
bone marrow, or a supernatant from a cell culture. 
Methods of obtaining a suitable bodily fluid sample 
from a subject are known to those skilled in the art. 
Methods of determining the level of antibody or 
antigen include, but are not limited to: ELISA, IFA, 
and Western blotting. Other methods are known to 
those skilled in the art. Further, a subject infected 
with a DNA virus associated with Kaposi's sarcoma may 
be diagnosed with the above described methods. 



The detection of the human herpesvirus and the
detection of virus-associated KS are essentially
identical processes. The basic principle is to detect
the virus using specific ligands that bind to the
virus but not to other proteins or nucleic acids in a
normal human cell or its environs. The ligands can
either be nucleic acid or antibodies. The ligands can
be naturally occurring or genetically or physically
modified such as nucleic acids with non-natural or
[...] antibody derivatives, i.e., Fab or chimeric

antibodies. Serological tests for detection of
antibodies to the virus may also be performed by using
protein antigens obtained from the human herpesvirus
and described herein.

Samples can be taken from patients with KS or from
patients at risk for KS, such as AIDS patients.
Typically the samples are taken from blood (cells,
serum and/or plasma) or from solid tissue samples such
as skin lesions. The most accurate diagnosis for KS
will occur if elevated titers of the virus are
detected in the blood or in involved lesions. KS may
also be indicated if antibodies to the virus are
detected and if other diagnostic factors for KS are
present.

A. Nucleic acid assays. 

The diagnostic assays of the invention can be nucleic
acid assays, such as nucleic acid hybridization assays
and assays which detect amplification of specific
nucleic acid, to detect a nucleic acid sequence of
the human herpesvirus described herein.

Accepted means for conducting hybridization assays are
known and general overviews of the technology can be
had from a review of: Nucleic Acid Hybridization: A
Practical Approach [72]; Hybridization of Nucleic
Acids Immobilized on Solid Supports [41]; Analytical
Biochemistry [4] and Innis et al., PCR Protocols [74],
supra, all of which are incorporated by reference
herein.

If PCR is used in conjunction with nucleic acid
hybridization, primers are designed to target a
specific portion of the nucleic acid of the
herpesvirus. For example, the primers set forth in
SEQ ID NOs: 38-40 may be used to target detection of
regions of the herpesvirus genome encoding ORF 25
homologue - ORF 32 homologue. From the information
provided herein, those of skill in the art will be

receptor on the surface of the target infected cell,
and which is internalized after binding.

iii) Administration 

The subjects to be treated or whose tissue may be used 
herein may be a mammal, or more specifically a human, 
horse, pig, rabbit, dog, monkey, or rodent. In the 
preferred embodiment the subject is a human. 



The compositions are administered in a manner
compatible with the dosage formulation, and in a
therapeutically effective amount. Precise amounts of
active ingredient required to be administered depend
on the judgment of the practitioner and are peculiar
to each subject.

Suitable regimes for initial administration and
booster shots are also variable, but are typified by
an initial administration followed by repeated doses
at one or more hour intervals by a subsequent
injection or other administration.

As used herein administration means a method of
administering to a subject. Such methods are well
known to those skilled in the art and include, but are
not limited to, administration topically,
parenterally, orally, intravenously, intramuscularly,
subcutaneously or by aerosol. Administration of the
agent may be effected continuously or intermittently
such that the therapeutic agent in the patient is
effective to treat a subject with Kaposi's sarcoma or
a subject infected with a DNA virus associated with
Kaposi's sarcoma.

The antiviral compositions for treating herpesvirus-
induced KS are preferably administered to human

acids induced by appropriately derivatized inhibitory
nucleic acids may also be used.

Cleavage, and therefore inactivation, of the target
nucleic acids may be effected by attaching a
substituent to the inhibitory nucleic acid which can
be activated to induce cleavage reactions. The
substituent can be one that effects either chemical
or enzymatic cleavage. Alternatively, cleavage can
be induced by the use of ribozymes or catalytic RNA.
In this approach, the inhibitory nucleic acids would
comprise either naturally occurring RNA (ribozymes) or
synthetic nucleic acids with catalytic activity.

The targeting of inhibitory nucleic acids to specific
cells of the immune system by conjugation with
targeting moieties binding receptors on the surface of
these cells can be used for all of the above forms of
inhibitory nucleic acid therapy. This invention
encompasses all of the forms of inhibitory nucleic
acid therapy as described above and as described in
Helene and Toulme.

This invention relates to the targeting of inhibitory
nucleic acids to sequences of the human herpesvirus of
the invention for use in treating KS. An example of
an antiherpes virus inhibitory nucleic acid is ISIS
2922 (ISIS Pharmaceuticals) which has activity against
CMV [see, Biotechnology News 14(14) p. 5].



A problem associated with inhibitory nucleic acid
therapy is the effective delivery of the inhibitory
nucleic acid to the target cell in vivo and the
subsequent internalization of the inhibitory nucleic
acid by that cell. This can be accomplished by
linking the inhibitory nucleic acid to a targeting
moiety to form a conjugate that binds to a specific

More commonly, inhibitory nucleic acids are designed
to bind to mRNA or mRNA precursors. Inhibitory
nucleic acids are used to prevent maturation of pre-
mRNA. Inhibitory nucleic acids may be designed to
interfere with RNA processing, splicing or
translation.



The inhibitory nucleic acids can be targeted to mRNA.
In this approach, the inhibitory nucleic acids are
designed to specifically block translation of the
encoded protein. Using this approach, the inhibitory
nucleic acid can be used to selectively suppress
certain cellular functions by inhibition of
translation of mRNA encoding critical proteins. For
example, an inhibitory nucleic acid complementary to
regions of c-myc mRNA inhibits c-myc protein
expression in a human promyelocytic leukemia cell
line, HL60, which overexpresses the c-myc proto-
oncogene. See Wickstrom E.L., et al. [93] and
Harel-Bellan, A., et al. [31A]. As described in
Helene and Toulme, inhibitory nucleic acids targeting
mRNA have been shown to work by several different
mechanisms to inhibit translation of the encoded
protein(s).

The inhibitory nucleic acids introduced into the cell
can also encompass the "sense" strand of the gene or
mRNA to trap or compete for the enzymes or binding
proteins involved in mRNA translation. See Helene and
Toulme.

Lastly, the inhibitory nucleic acids can be used to
induce chemical inactivation or cleavage of the target
genes or mRNA. Chemical inactivation can occur by the
induction of crosslinks between the inhibitory nucleic
acid and the target nucleic acid within the cell.
Other chemical modifications of the target nucleic

gene, although recently approaches for use of "sense"
nucleic acids have also been developed. The term
"inhibitory nucleic acids" as used herein, refers to
both "sense" and "antisense" nucleic acids.

By binding to the target nucleic acid, the inhibitory
nucleic acid can inhibit the function of the target
nucleic acid. This could, for example, be a result of
blocking DNA transcription, processing or poly (A)
addition to mRNA, DNA replication, translation, or
promoting inhibitory mechanisms of the cells, such as
promoting RNA degradation. Inhibitory nucleic acid
methods therefore encompass a number of different
approaches to altering expression of herpesvirus
genes. These different types of inhibitory nucleic
acid technology are described in Helene, C. and
Toulme, J. [34], which is hereby incorporated by
reference and is referred to hereinafter as "Helene
and Toulme."



In brief, inhibitory nucleic acid therapy approaches
can be classified into those that target DNA
sequences, those that target RNA sequences (including
pre-mRNA and mRNA), those that target proteins (sense
strand approaches), and those that cause cleavage or
chemical modification of the target nucleic acids.



Approaches targeting DNA fall into several categories.
Nucleic acids can be designed to bind to the major
groove of the duplex DNA to form a triple helical or
"triplex" structure. Alternatively, inhibitory
nucleic acids are designed to bind to regions of
single stranded DNA resulting from the opening of the
duplex DNA during replication or transcription. See
Helene and Toulme.
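
As a rough illustration of how candidate sites for the
triplex approach might be located, the sketch below
scans one strand of a duplex for uninterrupted purine
(A/G) runs. It rests on an assumption not stated in
this disclosure, namely the general observation that
triplex-forming oligonucleotides typically bind
homopurine tracts through the major groove. The
function name, the minimum tract length, and the
example sequence are hypothetical.

```python
# Illustrative sketch only: list homopurine (A/G-only) tracts on a DNA
# strand as candidate triplex target sites. The length threshold is an
# arbitrary illustrative choice.

import re
from typing import List, Tuple


def homopurine_tracts(dna: str, min_length: int = 15) -> List[Tuple[int, str]]:
    """Return (start_index, tract) for each purine run of at least `min_length`."""
    pattern = re.compile(r"[AG]{%d,}" % min_length)
    return [(match.start(), match.group()) for match in pattern.finditer(dna.upper())]


if __name__ == "__main__":
    example = "CTCT" + "AGGAAGGAGAAGGAGA" + "TCTC"  # made-up sequence
    for start, tract in homopurine_tracts(example):
        print(start, len(tract), tract)
```

Only the strand supplied is scanned; a purine tract on
the opposite strand would appear here as a pyrimidine
run and would need the reverse complement to be
scanned as well.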
U.S. Patent No. 4,708,935 (Suhadolnik et al.; Research
Corporation) describes a 3'-deoxyadenosine compound
effective in inhibiting HSV and EBV. U.S. Patent No.
4,386,076 (Machida et al.; Yamasa Shoyu Kabushiki
Kaisha) describes use of
(E)-5-(2-halogenovinyl)-arabinofuranosyluracil as an
antiherpesvirus agent. U.S. Patent No. 4,340,599
(Lieb et al.; Bayer Aktiengesellschaft) describes
phosphonohydroxyacetic acid derivatives useful as
antiherpes agents. U.S. Patent Nos. 4,093,715 and
4,093,716 (Lin et al.; Research Corporation) describe
5'-amino-5'-deoxythymidine and
5-iodo-5'-amino-2',5'-dideoxycytidine as potent
inhibitors of herpes simplex virus. U.S. Patent No.
4,069,382 (Baker et al.; Parke, Davis & Company)
describes 9-(5-O-acyl-beta-D-arabinofuranosyl)adenine
compounds useful as antiviral agents. U.S. Patent No.
3,927,216 (Witkowski et al.) describes the use of
1,2,4-triazole-3-carboxamide and
1,2,4-triazole-3-thiocarboxamide for inhibiting herpes
virus infections. U.S. Patent No. 5,179,093 (Afonso et
al.; Schering) describes quinoline-2,4-dione
derivatives active against herpes simplex virus 1 and
2, cytomegalovirus and Epstein Barr virus.






v) Inhibitory nucleic acid therapeutics 



Also contemplated here are inhibitory nucleic acid
therapeutics which can inhibit the activity of
herpesviruses in patients with KS. Inhibitory nucleic
acids may be single-stranded nucleic acids, which can
specifically bind to a complementary nucleic acid
sequence. By binding to the appropriate target
sequence, an RNA-RNA, a DNA-DNA, or RNA-DNA duplex or
triplex is formed. These nucleic acids are often

termed "antisense" because they are usually
complementary to the sense or coding strand of the

Brovavir is an example of an antiviral deoxyuridine
derivative of the type described in US Patent Nos.
4,542,210 and 4,386,076.

BHCG is an example of an antiviral carbocyclic
nucleoside analogue of the type described in US Patent
Nos. 5,153,352, 5,034,394 and 5,126,345.

HPMPC is an example of an antiviral phosphonyl
methoxyalkyl derivative of the type described in
US Patent No. 5,142,051.

CDG (Carbocyclic 2'-deoxyguanosine) is an example of
an antiviral carbocyclic nucleoside analogue of the
type described in US Patent Nos. 4,543,255, 4,855,466,
and 4,894,458.

Foscarnet is described in US Patent No. 4,339,445.

Trifluridine and its corresponding ribonucleoside are
described in US Patent No. 3,201,367.



U.S. Patent No. 5,321,030 (Kaddurah-Daouk et al.;
Amira) describes the use of creatine analogs as
antiherpes viral agents. U.S. Patent No. 5,306,722
(Kim et al.; Bristol-Myers Squibb) describes
thymidine kinase inhibitors useful for treating HSV
infections and for inhibiting herpes thymidine kinase.
Other antiherpesvirus compositions are described in
U.S. Patent Nos. 5,286,649 and 5,095,706 (Konishi et
al., Bristol-Myers Squibb) and 5,175,165 (Blumenkopf
et al.; Burroughs Wellcome). U.S. Patent No.
4,880,820 (Ashton et al.; Merck) describes the
antiherpes virus agent (S)-9-(2,3-dihydroxy-1-
propoxymethyl)guanine.

generation derivatives will soon be available that
will retain interferon's antiviral properties but have
reduced side effects.

It is also contemplated that herpes virus-induced KS
may be treated by administering a herpesvirus
reactivating agent to induce reactivation of the
latent virus. Preferably the reactivation is combined
with simultaneous or sequential administration of an
anti-herpesvirus agent. Controlled reactivation over
a short period of time or reactivation in the presence
of an antiviral agent is believed to minimize the
adverse effects of certain herpesvirus infections
(e.g., as discussed in PCT Application WO 93/04683).
Reactivating agents include agents such as estrogen,
phorbol esters, forskolin and β-adrenergic blocking
agents.

Agents useful for treatment of herpesvirus infections
and for treatment of herpesvirus-induced KS are
described in numerous U.S. Patents. For example,
ganciclovir is an example of an antiviral guanine
acyclic nucleotide of the type described in US Patent
Nos. 4,355,032 and 4,603,219.

Acyclovir is an example of a class of antiviral purine
derivatives, including 9-(2-
hydroxyethylmethyl)adenine, of the type described in
U.S. Pat. Nos. 4,287,188, 4,294,831 and 4,199,574.

Brivudin is an example of an antiviral deoxyuridine
derivative of the type described in US Patent No.
4,424,211.

Vidarabine is an example of an antiviral purine
nucleoside of the type described in British Pat.

Merck)) as well as other enzymes. It will be apparent
to one of ordinary skill in the art that there are
additional viral proteins, both characterized and as
yet to be discovered, that can serve as targets for
antiviral agents.

iv) Other agents and modes of antiviral
action.

Kutapressin is a liver derivative available from
Schwarz Pharma of Milwaukee, Wisconsin in an injectable
form of 25 mg/ml. The recommended dosage for
herpesviruses is from 200 to 25 mg/ml per day for an
average adult of 150 pounds.



Poly(I)-Poly(C12U), an accepted antiviral drug known as
Ampligen from HEM Pharmaceuticals of Rockville, MD has
been shown to inhibit herpesviruses and is another
antiviral agent suitable for treating KS. Intravenous
injection is the preferred route of administration.
Dosages from about 100 to 600 mg/m² are administered
two to three times weekly to adults averaging 150
pounds. It is best to administer at least 200 mg/m²
per week.



Other antiviral agents reported to show activity
against herpes viruses (e.g., varicella zoster and
herpes simplex) and that will be useful for the treatment
of herpesvirus-induced KS include mappicine ketone
(SmithKline Beecham); Compounds A.7S2SS and A, 7^205
(Abbott) for varicella zoster, and Compound 882C87
(Burroughs Wellcome) [see, The Pink Sheet 35(20) May
17, 1993].

Interferon is known to inhibit replication of herpes
viruses. See [73], supra. Interferon has known
toxicity problems and it is expected that second

polymerase directly without processing by viral
thymidine kinase. Foscarnet is reported to be less
toxic than PAA.

ii) Agents that target viral proteins other
than DNA polymerase or other viral
functions.

Although applicants do not intend to be bound by a
particular mechanism of antiviral action, the
antiherpes-virus agents described above are believed
to act through inhibition of viral DNA polymerase.
However, viral replication requires not only the
replication of the viral nucleic acid but also the
production of viral proteins and other essential
components. Accordingly, the present invention
contemplates treatment of KS by the inhibition of
viral proliferation by targeting viral proteins other
than DNA polymerase (e.g., by inhibition of their
synthesis or activity, or destruction of viral
proteins after their synthesis). For example,
administration of agents that inhibit a viral serine
protease, e.g., such as one important in development
of the viral capsid, will be useful in treatment of
viral induced KS.

Other viral enzyme targets include: OMP decarboxylase
(a target of, e.g., pyrazofurin), CTP
synthetase (a target of, e.g.,
cyclopentenylcytosine), IMP dehydrogenase,
ribonucleotide reductase (a target of, e.g., carboxyl-
containing N-alkyldipeptides as described in U.S.
Patent No. 5,110,799 (Tolman et al., Merck)),
thymidine kinase (a target of, e.g., 1-[2-
(hydroxymethyl)cycloalkylmethyl]-5-substituted
-uracils and -guanines as described in, e.g., U.S.
Patent Nos. 4,863,927 and 4,762,062 (Tolman et al.;

chlorodeoxyadenosine) is another nucleoside analogue
known as a highly specific antilymphocyte agent (i.e.,
an immunosuppressive drug).

Other useful antiviral agents include: 5-thien-2-yl-
2'-deoxyuridine derivatives, e.g., BTDU [5-(5-
bromothien-2-yl)-2'-deoxyuridine] and CTDU [5-(5-
chlorothien-2-yl)-2'-deoxyuridine]; and OXT-A [9-(2-
deoxy-2-hydroxymethyl-β-D-erythro-oxetanosyl)adenine]
and OXT-G [9-(2-deoxy-2-hydroxymethyl-β-D-erythro-
oxetanosyl)guanine]. Although OXT-G is believed to
act by inhibiting viral DNA synthesis its mechanism of
action has not yet been elucidated. These and other
compounds are described in Andrei et al. [5] which is
incorporated by reference herein. Additional
antiviral purine derivatives useful in treating
herpesvirus infections are disclosed in US Pat.
5,108,994 (assigned to Beecham Group P.L.C.). 6-
Methoxypurine arabinoside (ara-M; Burroughs Wellcome)
is a potent inhibitor of varicella-zoster virus, and
will be useful for treatment of KS.

Certain thymidine analogs (e.g., idoxuridine (5-iodo-
2'-deoxyuridine) and trifluorothymidine) have
antiherpes viral activity, but due to their systemic
toxicity, are largely used for topical herpesviral
infections, including HSV stromal keratitis and
uveitis, and are not preferred here unless other
options are ruled out.



Other useful antiviral agents that have demonstrated
antiherpes viral activity include foscarnet sodium
(trisodium phosphonoformate, PFA, Foscavir (Astra))
and phosphonoacetic acid (PAA). Foscarnet is an
inorganic pyrophosphate analogue that acts by
competitively blocking the pyrophosphate-binding site
of DNA polymerase. These agents which block DNA

terminator by the viral DNA polymerase during viral
replication. It has therapeutic activity against a
broad range of herpesviruses, Herpes simplex Types 1
and 2, Varicella-Zoster, Cytomegalovirus, and
Epstein-Barr Virus, and is used to treat diseases such
as herpes encephalitis, neonatal herpesvirus
infections, chickenpox in immunocompromised hosts,
herpes zoster recurrences, CMV retinitis, EBV
infections, chronic fatigue syndrome, and hairy
leukoplakia in AIDS patients. Exemplary intravenous
dosages or oral dosages are 250 mg/kg/m² body surface
area, every 8 hours for 7 days, or maintenance doses
of 200-400 mg IV or orally twice a day to suppress
recurrence. Ganciclovir has been shown to be more
active than acyclovir against some herpesviruses. See,
e.g., Oren and Soble [73]. Treatment protocols for
ganciclovir are 5 mg/kg twice a day IV or 2.5 mg/kg
three times a day for 10-14 days. Maintenance doses
are 5-6 mg/kg for 5-7 days.

Also of interest is HPMPC. HPMPC is reported to be
more active than either acyclovir or ganciclovir in
the chemotherapy and prophylaxis of various HSV-1,
HSV-2, TK- HSV, VZV or CMV infections in animal models
([22], supra).

Nucleoside analogs such as BVaraU are potent
inhibitors of HSV-1, EBV, and VZV that have greater
activity than acyclovir in animal models of
encephalitis. FIAC (fluoroiodoarabinosyl cytosine) and
its related fluoroethyl and iodo compounds (e.g., FEAU,
FIAU) have potent selective activity against
herpesviruses, and HPMPA ((S)-1-([3-hydroxy-2-
phosphorylmethoxy]propyl)adenine) has been
demonstrated to be more potent against HSV and CMV

than acyclovir or ganciclovir and are of choice in
advanced cases of KS. Cladribine (2-

amino-9-(4-acetoxy-3-(acetoxymethyl)but-1-yl)purine
(Smithkline Beecham)]; valacyclovir (Burroughs
Wellcome); desciclovir [(2-amino-9-(2-
ethoxymethyl)purine)] and 2-amino-9-(2-
hydroxyethoxymethyl)-9H-purine, prodrugs of
acyclovir]; CDG (carbocyclic 2'-deoxyguanosine); and
purine nucleosides with the pentafuranosyl ring
replaced by a cyclobutane ring (e.g., cyclobut-A
[(±)-9-[(1β,2α,3β)-2,3-bis(hydroxymethyl)-1-
cyclobutyl]adenine], cyclobut-G [(±)-9-[(1β,2α,3β)-
2,3-bis(hydroxymethyl)-1-cyclobutyl]guanine], BHCG
[(R)-(1α,2β,3α)-9-[2,3-
bis(hydroxymethyl)cyclobutyl]guanine], and an active
isomer of racemic BHCG, SQ 34,514 [1R-(1α,2β,3α)-2-
amino-9-[2,3-bis(hydroxymethyl)cyclobutyl]-6H-purin-6-
one (see Braitman et al. (1991) [20])]. Certain of
these antiherpesviral agents are discussed in Gorbach
et al. [26]; Saunders et al. [82]; Yamanaka et al.
[96]; Greenspan et al. [29], all of which are
incorporated by reference herein.

Triciribine and triciribine monophosphate are potent
inhibitors against herpes viruses (Ickes et al. [43],
incorporated by reference herein), HIV-1 and HIV-2
(Kucera et al. [51], incorporated by reference herein)
and are additional nucleoside analogs that may be used
to treat KS. An exemplary protocol for these agents
is an intravenous injection of about 0.35 mg/meter²
(0.7 mg/kg) once weekly or every other week for at
least two doses, preferably up to about four to eight
weeks.

Acyclovir and ganciclovir are of interest because of
their accepted use in clinical settings. Acyclovir,
an acyclic analogue of guanine, is phosphorylated by
a herpesvirus thymidine kinase and undergoes further
phosphorylation to be incorporated as a chain

nucleoside analogs including acyclic nucleoside
phosphonate analogs (e.g.,
phosphonylmethoxyalkylpurines and -pyrimidines), and
cyclic nucleoside analogs. These include drugs such
as: vidarabine (9-β-D-arabinofuranosyladenine; adenine
arabinoside, ara-A, Vira-A, Parke-Davis); 1-β-D-
arabinofuranosyluracil (ara-U); 1-β-D-
arabinofuranosyl-cytosine (ara-C); HPMPC [(S)-1-[3-
hydroxy-2-(phosphonylmethoxy)propyl]cytosine (e.g., GS
504 Gilead Science)] and its cyclic form (cHPMPC);
HPMPA [(S)-9-(3-hydroxy-2-
phosphonylmethoxypropyl)adenine] and its cyclic form
(cHPMPA); (S)-HPMPDAP [(S)-9-(3-hydroxy-2-
phosphonylmethoxypropyl)-2,6-diaminopurine]; PMEDAP
[9-(2-phosphonyl-methoxyethyl)-2,6-diaminopurine]; HOE
602 [2-amino-9-(1,3-bis(isopropoxy)-2-
propoxymethyl)purine]; PMEA [9-(2-
phosphonylmethoxyethyl)adenine]; bromovinyl-
deoxyuridine (Burns and Sandford [21]); 1-β-D-
arabinofuranosyl-E-5-(2-bromovinyl)-uridine or -2'-
deoxyuridine; BVaraU (1-β-D-arabinofuranosyl-E-5-(2-
bromovinyl)-uracil, brovavir, Bristol-Myers Squibb,
Yamasa Shoyu); BVDU [(E)-5-(2-bromovinyl)-2'-
deoxyuridine, brivudin, e.g., Helpin] and its
carbocyclic analogue (in which the sugar moiety is
replaced by a cyclopentane ring); IVDU [(E)-5-(2-
iodovinyl)-2'-deoxyuridine] and its carbocyclic
analogue, C-IVDU (Balzarini et al. [11])]; and 5-
mercurithio analogs of 2'-deoxyuridine (Holliday, J.,
and Williams, M.V. [38]); acyclovir [9-([2-
hydroxyethoxy]methyl)guanine; e.g., Zovirax (Burroughs
Wellcome)]; penciclovir (5-[4-hydroxy-2-
(hydroxymethyl)butyl]-guanine); ganciclovir [(9-[1,3-
dihydroxy-2 propoxymethyl]-guanine) e.g., Cymevene,
Cytovene (Syntex), DHPG (Stals et al. [89]];

isopropylether derivatives of ganciclovir (see, e.g.,
Winkelmann et al. [94]); cygalovir; famciclovir [2-

these agents are preferentially phosphorylated by
viral thymidine kinase (TK), if one is present, and/or
have higher affinity for viral DNA polymerase than for
the cellular DNA polymerases, resulting in selective
antiviral activity. Where a nucleoside analogue is
incorporated into the viral DNA, viral activity or
reproduction may be affected in a variety of ways.
For example, the analogue may act as a chain
terminator, cause increased lability (e.g.,
susceptibility to breakage) of analogue-containing
DNA, and/or impair the ability of the substituted DNA
to act as template for transcription or replication
(see, e.g., Balzarini et al. [11]).

It will be known to one of skill that, like many
drugs, many of the agents useful for treatment of
herpes virus infections are modified (i.e.,
"activated") by the host, host cell, or virus-infected
host cell metabolic enzymes. For example, acyclovir
is triphosphorylated to its active form, with the
first phosphorylation being carried out by the herpes
virus thymidine kinase, when present. Other examples
are the reported conversion of the compound HOE 602 to
ganciclovir in a three-step metabolic pathway (Winkler
et al. [95]) and the phosphorylation of ganciclovir to
its active form by, e.g., a CMV nucleotide kinase. It
will be apparent to one of skill that the specific
metabolic capabilities of a virus can affect the
sensitivity of that virus to specific drugs, and is
one factor in the choice of an antiviral drug. The

mechanism of action of certain anti-herpesvirus agents
is discussed in De Clercq and in other references
cited supra and infra, all of which are incorporated
by reference herein.

viral titer or bind to viral products. Antiviral
agents are effective if they inactivate the virus,
otherwise inhibit its infectivity or multiplication,
or alleviate the symptoms of KS.

A. Antiviral Agents.

The antiherpesvirus agents that will be useful for
treating virus-induced KS can be grouped into broad
classes based on their presumed modes of action.
These classes include agents that act (i) by
inhibition of viral DNA polymerase, (ii) by targeting
other viral enzymes and proteins, (iii) by
miscellaneous or incompletely understood mechanisms,
or (iv) by binding a target nucleic acid (i.e.,
inhibitory nucleic acid therapeutics). Antiviral
agents may also be used in combination (i.e., together
or sequentially) to achieve synergistic or additive
effects or other benefits.

Although it is convenient to group antiviral agents by
their supposed mechanism of action, the applicants do
not intend to be bound by any particular mechanism of
antiviral action. Moreover, it will be understood by
those of skill that an agent may act on more than one
target in a virus or virus-infected cell or through
more than one mechanism.






i) Inhibitors of viral DNA polymerase 



Many antiherpesvirus agents in clinical use or in
development today are nucleoside analogs believed to
act through inhibition of viral DNA replication,
especially through inhibition of viral DNA polymerase.
These nucleoside analogs act as alternative substrates
for the viral DNA polymerase or as competitive
inhibitors of DNA polymerase substrates. Usually

Burroughs Wellcome Co.). Combinations of TS-
inhibitors and viral TK-inhibitors in antiherpetic
medicines are disclosed in U.S. Pat. 5,137,724,
assigned to Stichting Rega VZW. A synergistic
inhibitory effect on EBV replication using certain
ratios of combinations of HPMPC with AZT was reported
by Lin et al. [56].

U.S. Patent Nos. 5,164,395 and 5,021,437 (Blumenkopf;
Burroughs Wellcome) describe the use of a
ribonucleotide reductase inhibitor (an acetylpyridine
derivative) for treatment of herpes infections,
including the use of the acetylpyridine derivative in
combination with acyclovir. U.S. Patent No. 5,137,724
(Balzarini et al. [11]) describes the use of thymidylate
synthase inhibitors (e.g., 5-fluoro-uracil and
5-fluoro-2'-deoxyuridine) in combination with compounds
having viral thymidine kinase inhibiting activity.

With the discovery of a disease causal agent for KS
now identified, effective therapeutic or prophylactic
protocols to alleviate or prevent the symptoms of
herpes virus-associated KS can be formulated. Due to
the viral nature of the disease, antiviral agents have
application here for treatment, such as interferons,
nucleoside analogues, ribavirin, amantadine, and
pyrophosphate analogues of phosphonoacetic acid
(foscarnet) (reviewed in Gorbach, S.L., et al. [26])
and the like. Immunological therapy will also be
effective in many cases to manage and alleviate
symptoms caused by the disease agents described here.
Antiviral agents include agents or compositions that
directly bind to viral products and interfere with
disease progress; and, excludes agents that do not
impact directly on viral multiplication or viral
titer. Antiviral agents do not include
immunoregulatory agents that

This invention provides a method for treating a
subject with Kaposi's sarcoma (KS) comprising
administering to the subject having a human
herpesvirus-associated KS a pharmaceutically effective
amount of an antiviral agent in a pharmaceutically
acceptable carrier, wherein the agent is effective to
treat the subject with KS-associated human herpes
virus.



Further, this invention provides a method of
prophylaxis or treatment for Kaposi's sarcoma (KS) by
administering to a patient at risk for KS an antibody
that binds to the human herpesvirus in a
pharmaceutically acceptable carrier. In one
embodiment the antiviral drug is used to treat a
subject with the DNA herpesvirus of the subject
invention.

The use of combinations of antiviral drugs and
sequential treatments are useful for treatment of
herpesvirus infections and will also be useful for the
treatment of herpesvirus-induced KS. For example,
Snoeck et al. [86] found additive or synergistic
effects against CMV when combining antiherpes drugs
(e.g., combinations of zidovudine [3'-azido-3'-
deoxythymidine, AZT] with HPMPC, ganciclovir,
foscarnet or acyclovir or of HPMPC with other
antivirals). Similarly, in treatment of
cytomegalovirus retinitis, induction with ganciclovir
followed by maintenance with foscarnet has been
suggested as a way to maximize efficacy while
minimizing the adverse side effects of either
treatment alone. An anti-herpetic composition that
contains acyclovir and, e.g., 2-acetylpyridine-5-((2-
pyridylamino)thiocarbonyl)-thiocarbonohydrazone is

intervals and thawed onto 3-aminopropyltriethoxysilane
treated slides and allowed to air dry. The slides are
then fixed in 4% freshly prepared paraformaldehyde,
rinsed in water. Formalin-fixed, paraffin embedded KS
tissues cut at 6 um and baked onto glass slides can
also be used. The sections are then deparaffinized in
xylenes and rehydrated through graded alcohols.
Prehybridization in 20mM Tris pH 7.5, 0.02% Denhardt's
solution, 10% dextran sulfate for 30 min at 37°C is
followed by hybridization overnight in a solution of
50% formamide (v/v), 10% dextran sulfate (w/v), 20mM
sodium phosphate (pH 7.4), 3X SSC, 1X Denhardt's
solution, 100 ug/ml salmon sperm DNA, 125 ug/ml yeast
tRNA and the oligo probe (lO'cpm/ml) at 42°C overnight.
The slides are washed twice with 2X SSC and twice with
1X SSC for 15 minutes each at room temperature and
visualized by autoradiography. Briefly, sections are
dehydrated through graded alcohols containing 0.3M
ammonium acetate and air dried. The slides are dipped
in Kodak NTB2 emulsion, exposed for days to weeks,
developed, and counterstained with hematoxylin and
eosin. Alternative immunohistochemical protocols may
be employed which are known to those skilled in the

IV. Treatment of human herpesvirus-induced KS

This invention provides a method of treating a subject
with Kaposi's sarcoma, comprising administering to the
subject an effective amount of the antisense molecule
capable of hybridizing to the isolated DNA molecule
under conditions such that the antisense molecule
selectively enters a tumor cell of the subject, so as
to treat the subject.

to a solid support, typically a glass slide. The
cells are then contacted with a hybridization solution
at a moderate temperature to permit annealing of
target-specific probes that are labelled. The probes
are preferably labelled with radioisotopes or
fluorescent reporters.

The above described probes are also useful for in-situ
hybridization or in order to locate tissues which
express this gene, or for other hybridization assays
for the presence of this gene or its mRNA in various
biological tissues. In-situ hybridization is a
sensitive localization method which is not dependent
on expression of antigens or native vs. denatured
conditions.

Oligonucleotide (oligo) probes, synthetic
oligonucleotide probes or riboprobes made from KSHV
phagemids/plasmids, are relatively homogeneous
reagents, and successful hybridization conditions in
tissue sections are readily transferable from one probe
to another. Commercially synthesized oligonucleotide
probes are prepared against the identified genes.
These probes are chosen for length (45-65 mers) and high
G-C content (50-70%), and are screened for uniqueness
against other viral sequences in GenBank.
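
The length and G-C content criteria stated above can be illustrated with a short sketch. It is a minimal illustration only: the function names and the candidate sequences are hypothetical and are not KSHV sequences, and the screen for uniqueness against GenBank is not covered.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def passes_probe_criteria(seq: str, min_len: int = 45, max_len: int = 65,
                          min_gc: float = 0.50, max_gc: float = 0.70) -> bool:
    """Apply the stated criteria: 45-65 mers with 50-70% G-C content."""
    return (min_len <= len(seq) <= max_len
            and min_gc <= gc_fraction(seq) <= max_gc)


# Hypothetical candidates chosen only to exercise the filter.
candidates = {
    "candidate_1": "GC" * 15 + "AT" * 10,  # 50 nt, 60% G-C
    "candidate_2": "AT" * 25,              # 50 nt, no G-C
    "candidate_3": "GCGC" * 5,             # 20 nt, too short
}

selected = [name for name, seq in candidates.items()
            if passes_probe_criteria(seq)]
print(selected)
```

Candidates that pass such a filter would still need to be compared against other viral sequences before use.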

Oligonucleotides are 3' end-labeled with [α-35S]dATP to
specific activities in the range of 1 x 10 1: dpm/ug
using terminal deoxynucleotidyl transferase.
Unincorporated labeled nucleotides are removed from
the oligo probe by centrifugation through a Sephadex
G-25 column or by elution from a Waters Sep Pak C-18
column.



KS tissue embedded in OCT compound and snap frozen in
freezing isopentane cooled with dry ice is cut at 6 um

easily be distinguished by one of skill from the
specific signal. Two fold signal over background is
acceptable.

A preferred method for detecting the KS-associated
herpesvirus is the use of PCR and/or dot blot
hybridization. Other methods for determining the
presence or absence of a KS agent for detection,
prognosis, or risk assessment for KS include Southern
transfers, solution hybridization or non-radioactive
detection systems, all of which are well known to
those of skill in the art. Hybridization is carried
out using probes. Visualization of the hybridized
portions allows the qualitative determination of the
presence or absence of the causal agent.

Similarly, a Northern transfer may be used for the
detection of message in samples of RNA, or reverse
transcriptase PCR may be used and the cDNA detected by
methods described above. This procedure is also well
known in the art. See [81] incorporated by reference
herein.

An alternative means for determining the presence of
the human herpesvirus is in situ hybridization, or
more recently, in situ polymerase chain reaction. In
situ PCR is described in Nuovo et al. [71],
Intracellular localization of polymerase chain
reaction (PCR)-amplified Hepatitis C cDNA; Bagasra et
al. [10], Detection of Human Immunodeficiency virus
type 1 provirus in mononuclear cells by in situ
polymerase chain reaction; and Heniford et al. [3d],
Variation in cellular EGF receptor mRNA expression
demonstrated by in situ reverse transcriptase
polymerase chain reaction. In situ hybridization
assays are well known and are generally described in
Methods Enzymol. [67] incorporated by reference
herein. In an in situ hybridization, cells are fixed

prepared from one or more KS-associated human
herpesviruses of the invention. Briefly, to identify
a target specific probe, DNA is isolated from the
virus. Test DNA, either viral or cellular, is
transferred to a solid (e.g., charged nylon) matrix.

The probes are labelled following conventional
methods. Following denaturation and/or
prehybridization steps known in the art, the probe is
hybridized to the immobilized DNAs under stringent
conditions. Stringent hybridization conditions will
depend on the probe used and can be estimated from the
calculated Tm (melting temperature) of the hybridized
probe (see, e.g., Sambrook for a description of
calculation of the Tm). For radioactively-labeled DNA
or RNA probes an example of stringent hybridization
conditions is hybridization in a solution containing
denatured probe and 5x SSC at 65°C for 8-24 hours
followed by washes in 0.1x SSC, 0.1% SDS (sodium
dodecyl sulfate) at 50-65°C. In general, the
temperature and salt concentration are chosen so that
the post hybridization wash occurs at a temperature
that is about 5°C below the Tm of the hybrid. Thus for
a particular salt concentration the temperature may be
selected that is 5°C below the Tm or conversely, for a
particular temperature, the salt concentration is
chosen to provide a Tm for the hybrid that is 5°C
warmer than the wash temperature. Following stringent
hybridization and washing, a probe that hybridizes to
the KS-associated viral DNA but not to the non-KS
associated viral DNA, as evidenced by the presence of
a signal associated with the appropriate target and
the absence of a signal from the non-target nucleic
acids, is identified as specific for the KS associated
virus. It is further appreciated that in determining
probe specificity and in utilizing the method of this
invention to detect KS-associated herpesvirus, a
certain amount of background signal is typical and can

A probe can be identified as capable of hybridizing
specifically to its target nucleic acid by hybridizing
the probe to a sample treated according to the protocol
of this invention where the sample contains both
target virus and animal cells (e.g., nerve cells). A
probe is specific if the probe's characteristic signal
is associated with the herpesvirus DNA in the sample
and not generally with the DNA of the host cells and
non-biological materials (e.g., substrate) in a
sample.

The following stringent hybridization and washing
conditions will be adequate to distinguish a specific
probe (e.g., a fluorescently labeled DNA probe) from
a probe that is not specific: incubation of the probe
with the sample for 12 hours at 37°C in a solution
containing denatured probe, 50% formamide, 2X SSC, and
0.1% (w/v) dextran sulfate, followed by washing in 1X
SSC at 70°C for 5 minutes; 2X SSC at 37°C for 5
minutes; 0.2X SSC at room temperature for 5 minutes;
and H2O at room temperature for 5 minutes. Those of
skill will be aware that it will often be advantageous
in nucleic acid hybridizations (i.e., in situ,
Southern, or other) to include detergents (e.g.,
sodium dodecyl sulfate), chelating agents (e.g., EDTA)
or other reagents (e.g., buffers, Denhardt's solution,
dextran sulfate) in the hybridization or wash
solutions. To test the specificity of the virus
specific probes, the probes can be tested on host
cells containing the KS-associated herpesvirus and
compared with the results from cells containing non-
KS-associated virus.
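
The general rule that the post-hybridization wash is run about 5°C below the Tm of the hybrid, and the converse choice of salt concentration for a fixed wash temperature, can be sketched numerically. The sketch below is an illustration only. It assumes a commonly quoted empirical approximation for long DNA:DNA hybrids; the coefficients, the function names and the example values are assumptions made for illustration and are not taken from this specification. Optimal conditions must still be determined empirically.

```python
import math


def estimate_tm_celsius(gc_percent, length_bp, na_molar, formamide_percent=0.0):
    """Estimate the Tm of a long DNA:DNA hybrid.

    Assumed empirical approximation (not from the specification):
    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


def wash_temperature_celsius(tm_celsius, offset_celsius=5.0):
    """Wash temperature about 5 degrees C below the Tm of the hybrid."""
    return tm_celsius - offset_celsius


def sodium_for_wash_temperature(wash_celsius, gc_percent, length_bp,
                                formamide_percent=0.0, offset_celsius=5.0):
    """Converse case: sodium molarity giving a Tm 5 degrees C above the wash."""
    target_tm = wash_celsius + offset_celsius
    exponent = (target_tm - 81.5 - 0.41 * gc_percent
                + 0.61 * formamide_percent + 500.0 / length_bp) / 16.6
    return 10.0 ** exponent


# Hypothetical probe: 500 bases, 50% G-C, washed at 0.1 M sodium.
tm = estimate_tm_celsius(gc_percent=50.0, length_bp=500, na_molar=0.1)
wash = wash_temperature_celsius(tm)
print(round(tm, 1), round(wash, 1))
```

The sodium molarity is passed in directly rather than derived from an SSC dilution, so that no assumption about the composition of the wash buffer is built into the sketch.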



It will be apparent to those of ordinary skill in the
art that a convenient method for determining whether
a probe is specific for a KS-associated viral nucleic
acid utilizes a Southern blot (or Dot blot) using DNA

may be longer (e.g., at least about 50 or 100 bases in
length). Often the probe will be more than about 100
bases in length. For example, when probe is prepared
by nick-translation of DNA in the presence of labeled
nucleotides the average probe length may be about 100-
600 bases.

As noted above, the probe will be capable of specific
hybridization to a specific KS-associated herpes virus
nucleic acid. Such "specific hybridization" occurs
when a probe hybridizes to a target nucleic acid, as
evidenced by a detectable signal, under conditions in
which the probe does not hybridize to other nucleic
acids (e.g., animal cell or other bacterial nucleic
acids) present in the sample. A variety of factors
including the length and base composition of the
probe, the extent of base mismatching between the
probe and the target nucleic acid, the presence of
salt and organic solvents, probe concentration, and
the temperature affect hybridization, and optimal
hybridization conditions must often be determined
empirically. For discussions of nucleic acid probe
design and annealing conditions, see, for example,
[81], supra, Ausubel, F., et al. [8] [hereinafter
referred to as Sambrook], Methods in Enzymology [67]
or Hybridization with Nucleic Acid Probes [42] all of
which are incorporated herein by reference.

Usually, at least a part of the probe will have
considerable sequence identity with the target nucleic
acid. Although the extent of the sequence identity
required for specific hybridization will depend on the
length of the probe and the hybridization conditions,
the probe will usually have at least 70% identity to
the target nucleic acid, more usually at least 80%
identity, still more usually at least 90% identity and
most usually at least 95% or 100% identity.
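
The identity thresholds given above can be illustrated with a simple calculation. The sketch assumes that the probe and target have already been aligned to equal length without gaps, which is a simplification. The function names and the two short sequences are hypothetical and serve only as an example.

```python
def percent_identity(probe: str, target: str) -> float:
    """Percent of positions that match in two pre-aligned, equal-length sequences."""
    if not probe or len(probe) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(probe.upper(), target.upper()))
    return 100.0 * matches / len(probe)


def identity_tier(pct: float):
    """Return the highest stated threshold (95, 90, 80 or 70) that is met, else None."""
    for threshold in (95, 90, 80, 70):
        if pct >= threshold:
            return threshold
    return None


# Hypothetical 10-base example with a single mismatch at position 9.
pct = percent_identity("ACGTACGTAC", "ACGTACGTTC")
print(pct, identity_tier(pct))
```

Whether a given level of identity supports specific hybridization still depends on the probe length and the hybridization conditions, as stated above.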

Target specific probes may be used in the nucleic acid
hybridization diagnostic assays for KS. The probes
are specific for or complementary to the target of
interest. For precise allelic differentiations, the
probes should be about 14 nucleotides long and
preferably about 20-30 nucleotides. For more general
detection of the human herpesvirus of the invention,
nucleic acid probes are about 50 to about 1000
nucleotides, most preferably about 200 to about 400
nucleotides.

A sequence is "specific" for a target organism of
interest if it includes a nucleic acid sequence which
when detected is determinative of the presence of the
organism in the presence of a heterogeneous population
of proteins and other biologics. A specific nucleic
acid probe is targeted to that portion of the sequence
which is determinative of the organism and will not
hybridize to other sequences especially those of the
host where a pathogen is being detected.



The specific nucleic acid probe can be RNA or DNA
polynucleotide or oligonucleotide, or their analogs.
The probes may be single or double stranded
nucleotides. The probes of the invention may be
synthesized enzymatically, using methods well known in
the art (e.g., nick translation, primer extension,
reverse transcription, the polymerase chain reaction,
and others) or chemically (e.g., by methods such as
the phosphoramidite method described by Beaucage and
Carruthers [19], or by the triester method according
to Matteucci, et al. [62], both incorporated herein by
reference).

The probe must be of sufficient length to be able to
form a stable duplex with its target nucleic acid in
the sample, i.e., at least about 14 nucleotides, and

patients via oral, intravenous or parenteral
administrations and other systemic forms. Those of
skill in the art will understand appropriate
administration protocol for the individual
compositions to be employed by the physician.

The pharmaceutical formulations or compositions of
this invention may be in the dosage form of solid,
semi-solid, or liquid such as, e.g., suspensions,
aerosols or the like. Preferably the compositions are
administered in unit dosage forms suitable for single
administration of precise dosage amounts. The
compositions may also include, depending on the
formulation desired, pharmaceutically-acceptable, non-
toxic carriers or diluents, which are defined as
vehicles commonly used to formulate pharmaceutical
compositions for animal or human administration. The
diluent is selected so as not to affect the biological
activity of the combination. Examples of such
diluents are distilled water, physiological saline,
Ringer's solution, dextrose solution, and Hank's
solution. In addition, the pharmaceutical composition
or formulation may also include other carriers,
adjuvants; or nontoxic, nontherapeutic, nonimmunogenic
stabilizers and the like. Effective amounts of such
diluent or carrier are those amounts which are
effective to obtain a pharmaceutically acceptable
formulation in terms of solubility of components, or
biological activity, etc.



V. Immunological Approaches to Therapy.

Having identified a primary causal agent of KS in
humans as a novel human herpesvirus, there are
immunosuppressive therapies that can mediate the
immunologic dysfunction that arises from the presence
of the virus. In particular, agents that
block the immunological attack of the viral infected
cells will ameliorate the symptoms of KS and/or reduce
the disease progress. Such therapies include
antibodies that specifically block the targeting of
viral infected cells. Such agents include antibodies
which bind to cytokines that upregulate the immune
system to target viral infected cells.

The antibody may be administered to a patient either
singly or in a cocktail containing two or more
antibodies, other therapeutic agents, compositions, or
the like, including, but not limited to,
immunosuppressive agents, potentiators and side-effect
relieving agents. Of particular interest are
immunosuppressive agents useful in suppressing allergic
reactions of a host. Immunosuppressive agents of
interest include prednisone, prednisolone, DECADRON (Merck,
Sharp & Dohme, West Point, PA), cyclophosphamide,
cyclosporine, 6-mercaptopurine, methotrexate,
azathioprine and i.v. gamma globulin or their
combination. Potentiators of interest include
monensin, ammonium chloride and chloroquine. All of
these agents are administered in generally accepted
efficacious dose ranges such as those disclosed in the
Physician Desk Reference, 41st Ed. (1987), Publisher
Edward R. Barnhart, New Jersey.

Immune globulin from persons previously infected with
human herpesviruses or related viruses can be obtained
using standard techniques. Appropriate titers of
antibodies are known for this therapy and are readily
applied to the treatment of KS. Immune globulin can
be administered via parenteral injection or by
intrathecal shunt. In brief, immune globulin
preparations may be obtained from individual donors
who are screened for antibodies to the KS-associated
human herpesvirus, and plasmas from high-titered
donors are pooled. Alternatively, plasmas from donors
are pooled and then tested for antibodies to the human
herpesvirus of the invention; high-titered pools are
then selected for use in KS patients.

Antibodies may be formulated into an injectable
preparation. Parenteral formulations are known and
are suitable for use in the invention, preferably for
i.m. or i.v. administration. The formulations
containing therapeutically effective amounts of
antibodies or immunotoxins are either sterile liquid
solutions, liquid suspensions or lyophilized versions
and optionally contain stabilizers or excipients.
Lyophilized compositions are reconstituted with
suitable diluents, e.g., water for injection, saline,
0.3% glycine and the like, at a level of about from
.01 mg/kg of host body weight to 10 mg/kg where
appropriate. Typically, the pharmaceutical
compositions containing the antibodies or immunotoxins
will be administered in a therapeutically effective
dose in a range of from about .01 mg/kg to about 5
mg/kg of the treated mammal. A preferred
therapeutically effective dose of the pharmaceutical
composition containing antibody or immunotoxin will be
in a range of from about 0.01 mg/kg to about 0.5 mg/kg
body weight of the treated mammal administered over
several days to two weeks by daily intravenous
infusion, each given over a one hour period, in a
sequential patient dose-escalation regimen.

Antibody may be administered systemically by injection
i.m., subcutaneously or intraperitoneally or directly
into KS lesions. The dose will be dependent upon the
properties of the antibody or immunotoxin employed,
e.g., its activity and biological half-life, the
concentration of antibody in the formulation, the site
and rate of dosage, the clinical tolerance of the
patient involved, the disease afflicting the patient
and the like as is well within the skill of the
physician.

The antibody of the present invention may be
administered in solution. The pH of the solution
should be in the range of pH 5 to 9.5, preferably pH
6.5 to 7.5. The antibody or derivatives thereof
should be in a solution having a suitable
pharmaceutically acceptable buffer such as phosphate,
tris(hydroxymethyl)aminomethane-HCl or citrate and
the like. Buffer concentrations should be in the
range of 1 to 100 mM. The solution of antibody may
also contain a salt, such as sodium chloride or
potassium chloride in a concentration of 50 to 150 mM.
An effective amount of a stabilizing agent such as an
albumin, a globulin, a gelatin, a protamine or a salt
of protamine may also be included and may be added to
a solution containing antibody or immunotoxin or to
the composition from which the solution is prepared.

Systemic administration of antibody is made daily, 
generally by intramuscular injection, although 
intravascular infusion is acceptable. Administration 
may also be intranasal or by other nonparenteral 
routes. Antibody or immunotoxin may also be 
administered via microspheres, liposomes or other 
microparticulate delivery systems placed in certain 
tissues including blood. 

In therapeutic applications, the dosages of compounds 
used in accordance with the invention vary depending 
on the class of compound and the condition being 
treated. The age, weight, and clinical condition of 
the recipient patient, and the experience and judgment 
of the clinician or practitioner administering the 
therapy are among the factors affecting the selected 
dosage. For example, the dosage of an immunoglobulin 
can range from about 0.1 milligram per kilogram of 
body weight per day to about 10 mg/kg per day for 
polyclonal antibodies and about 5% to about 20% of 
that amount for monoclonal antibodies. In such a 
case, the immunoglobulin can be administered once 
daily as an intravenous infusion. Preferably, the 
dosage is repeated daily until either a therapeutic 
result is achieved or until side effects warrant 
discontinuation of therapy. Generally, the dose 
should be sufficient to treat or ameliorate symptoms 
or signs of KS without producing unacceptable toxicity 
to the patient. 
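
The weight-based ranges quoted above translate into absolute 
daily amounts by simple multiplication. The following Python 
sketch is illustrative arithmetic only and is not dosing 
guidance. The function name, the 70 kg example weight, and the 
reading of "about 5% to about 20% of that amount" as 5% of the 
low end up to 20% of the high end of the polyclonal range are 
assumptions, not statements from the specification. 

```python
def immunoglobulin_daily_dose_mg(weight_kg, monoclonal=False):
    """Return the (low, high) daily dose in mg for a given body weight.

    Polyclonal range: about 0.1 to about 10 mg/kg per day (from the text).
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    low = 0.1 * weight_kg
    high = 10.0 * weight_kg
    if monoclonal:
        # Assumption: 5% of the low end up to 20% of the high end.
        low, high = 0.05 * low, 0.20 * high
    return low, high


# Hypothetical 70 kg patient.
poly_low, poly_high = immunoglobulin_daily_dose_mg(70)
mono_low, mono_high = immunoglobulin_daily_dose_mg(70, monoclonal=True)
print(f"polyclonal: {poly_low:.1f} to {poly_high:.1f} mg/day")
print(f"monoclonal: {mono_low:.2f} to {mono_high:.1f} mg/day")
```

Other readings of the monoclonal fraction are possible; the 
range returned here is simply the widest one consistent with 
the stated percentages. 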

An effective amount of the compound is that which 
provides either subjective relief of a symptom(s) or 
an objectively identifiable improvement as noted by 
the clinician or other qualified observer. The dosing 
range varies with the compound used, the route of 
administration and the potency of the particular 
compound. 

VI. Vaccines and Prophylaxis for KS 

This invention provides a method of vaccinating a 
subject against Kaposi's sarcoma, comprising 
administering to the subject an effective amount of 
the peptide or polypeptide encoded by the isolated DNA 
molecule, and a suitable acceptable carrier, thereby 
vaccinating the subject. In one embodiment naked DNA 
is administered to the subject in an effective amount 
to vaccinate the subject against Kaposi's sarcoma. 

This invention provides a method of immunizing a 
subject against a disease caused by the DNA 
herpesvirus associated with Kaposi's sarcoma which 
comprises administering to the subject an effective 
immunizing dose of the isolated herpesvirus vaccine. 

A. Vaccines 

The invention also provides substances suitable for 
use as vaccines for the prevention of KS and methods 
for administering them. The vaccines are directed 
against the human herpesvirus of the invention, and 
most preferably comprise antigen obtained from the 
KS-associated human herpesvirus. 

Vaccines can be made recombinantly. Typically, a 
vaccine will include from about 1 to about 50 
micrograms of antigen or antigenic protein or peptide. 
More preferably, the amount of protein is from about 
15 to about 45 micrograms. Typically, the vaccine is 
formulated so that a dose includes about 0.5 
milliliters. The vaccine may be administered by any 
route known in the art. Preferably, the route is 
parenteral. More preferably, it is subcutaneous or 
intramuscular. 

There are a number of strategies for amplifying an 
antigen's effectiveness, particularly as related to 
the art of vaccines. For example, cyclization or 
circularization of a peptide can increase the 
peptide's antigenic and immunogenic potency. See U.S. 
Pat. No. 5,001,049 which is incorporated by reference 
herein. More conventionally, an antigen can be 
conjugated to a suitable carrier, usually a protein 
molecule. This procedure has several facets. It can 
allow multiple copies of an antigen, such as a 
peptide, to be conjugated to a single larger carrier 
molecule. Additionally, the carrier may possess 
properties which facilitate transport, binding, 
absorption or transfer of the antigen. 

For parenteral administration, such as subcutaneous 
injection, examples of suitable carriers are the 
tetanus toxoid, the diphtheria toxoid, serum albumin 
and lamprey, or keyhole limpet, hemocyanin because 
they provide the resultant conjugate with minimum 
genetic restriction. Conjugates including these 
universal carriers can function as T cell clone 
activators in individuals having very different gene 
sets. 



The conjugation between a peptide and a carrier can be 
accomplished using one of the methods known in the 
art. Specifically, the conjugation can use 
bifunctional cross-linkers as binding agents as 
detailed, for example, by Means and Feeney, "A recent 
review of protein modification techniques," 
Bioconjugate Chem. 1:2-12 (1990). 

Vaccines against a number of the Herpesviruses have 
been successfully developed. A vaccine against 
Varicella-Zoster Virus using a live attenuated Oka 
strain is effective in preventing herpes zoster in the 
elderly, and in preventing chickenpox in both 
immunocompromised and normal children (Hardy, I., et 
al. [30]; Hardy, I. et al. [31]; Levin, M.J. et al. 
[54]; Gershon, A.A. [26]). Vaccines against Herpes 
simplex Types 1 and 2 are also commercially available 
with some success in protection against primary 
disease, but have been less successful in preventing 
the establishment of latent infection in sensory 
ganglia (Roizman, B. [78]; Skinner, G.R. et al. [57]). 

Vaccines against the human herpesvirus can be made by 
isolating extracellular viral particles from infected 
cell cultures, inactivating the virus with 
formaldehyde followed by ultracentrifugation to 
concentrate the particles and remove the 
formaldehyde, and immunizing individuals with 2 or 3 
doses containing 1 x 10⁵ virus particles (Skinner, G.R. 
et al. [86]). Alternatively, envelope glycoproteins 
can be expressed in E. coli or transfected into stable 
mammalian cell lines, and the proteins can be purified 
and used for vaccination (Lasky, L.A. [53]). MHC 
binding peptides from cells infected with the human 
herpesvirus can be identified for vaccine candidates 
per the methodology of [61], supra. 



The antigen may be combined or mixed with various 
solutions and other compounds as is known in the art. 
For example, it may be administered in water, saline 
or buffered vehicles with or without various adjuvants 
or immunodiluting agents. Examples of such adjuvants 
or agents include aluminum hydroxide, aluminum 
phosphate, aluminum potassium sulfate (alum), 
beryllium sulfate, silica, kaolin, carbon, water-in- 
oil emulsions, oil-in-water emulsions, muramyl 
dipeptide, bacterial endotoxin, lipid X, 
Corynebacterium parvum (Propionibacterium acnes), 
Bordetella pertussis, polyribonucleotides, sodium 
alginate, lanolin, lysolecithin, vitamin A, saponin, 
liposomes, levamisole, DEAE-dextran, blocked 
copolymers or other synthetic adjuvants. Such 
adjuvants are available commercially from various 
sources, for example, Merck Adjuvant 65 (Merck and 
Company, Inc., Rahway, N.J.) or Freund's Incomplete 
Adjuvant and Complete Adjuvant (Difco Laboratories, 
Detroit, Michigan). Other suitable adjuvants are 
Amphigen (oil-in-water), Alhydrogel (aluminum 
hydroxide), or a mixture of Amphigen and Alhydrogel. 
Only aluminum is approved for human use. 

The proportion of antigen and adjuvant can be varied 
over a broad range so long as both are present in 
effective amounts. For example, aluminum hydroxide 
can be present in an amount of about 0.5% of the 
vaccine mixture (Al2O3 basis). On a per-dose basis, 
the amount of the antigen can range from about 0.1 µg 
to about 100 µg protein per patient. A preferable 
range is from about 1 µg to about 50 µg per dose. A 
more preferred range is about 15 µg to about 45 µg. 
A suitable dose size is about 0.5 ml. Accordingly, a 
dose for intramuscular injection, for example, would 
comprise 0.5 ml containing 45 µg of antigen in 
admixture with 0.5% aluminum hydroxide. After 
formulation, the vaccine may be incorporated into a 
sterile container which is then sealed and stored at 
low temperature, for example 4°C, or it may be 
freeze-dried. Lyophilization permits long-term 
storage in a stabilized form. 
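
The example dose above (0.5 ml containing 45 µg of antigen in 
admixture with 0.5% aluminum hydroxide) scales linearly to a 
batch. The Python sketch below shows that arithmetic. It 
assumes the 0.5% figure is weight/volume (so 1% corresponds to 
10 mg/ml), which the text does not state; the function name and 
the 100-dose batch are likewise hypothetical. 

```python
def batch_requirements(n_doses, dose_ml=0.5, antigen_ug_per_dose=45.0,
                       adjuvant_pct_wv=0.5):
    """Scale the per-dose formulation to a batch of n_doses.

    Assumption: adjuvant_pct_wv is percent weight/volume, i.e. 1% = 10 mg/ml.
    """
    if n_doses <= 0:
        raise ValueError("n_doses must be positive")
    total_ml = n_doses * dose_ml
    return {
        "total_volume_ml": total_ml,
        "antigen_mg": n_doses * antigen_ug_per_dose / 1000.0,
        "adjuvant_mg": total_ml * adjuvant_pct_wv * 10.0,
    }


# Hypothetical batch of 100 doses.
print(batch_requirements(100))
```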

The vaccines may be administered by any conventional 
method for the administration of vaccines including 
oral and parenteral (e.g., subcutaneous or 
intramuscular) injection. Intramuscular 
administration is preferred. The treatment may 
consist of a single dose of vaccine or a plurality of 
doses over a period of time. It is preferred that the 
dose be given to a human patient within the first 8 
months of life. The antigen of the invention can be 
combined with appropriate doses of compounds including 
influenza antigens, such as influenza type A antigens. 
Also, the antigen could be a component of a 
recombinant vaccine which could be adaptable for oral 
administration. 

Vaccines of the invention may be combined with other 
vaccines for other diseases to produce multivalent 
vaccines. A pharmaceutically effective amount of the 
antigen can be employed with a pharmaceutically 
acceptable carrier such as a protein or diluent useful 
for the vaccination of mammals, particularly humans. 
Other vaccines may be prepared according to methods 
well-known to those skilled in the art. 

Those of skill will readily recognize that it is only 
necessary to expose a mammal to appropriate epitopes 
in order to elicit effective immunoprotection. The 
epitopes are typically segments of amino acids which 
are a small portion of the whole protein. Using 
recombinant genetics, it is routine to alter a natural 
protein's primary structure to create derivatives 
embracing epitopes that are identical to or 
substantially the same as (immunologically equivalent 
to) the naturally occurring epitopes. Such 
derivatives may include peptide fragments, amino acid 
substitutions, amino acid deletions and amino acid 
additions of the amino acid sequence for the viral 
proteins from the human herpesvirus. For example, it 
is known in the protein art that certain amino acid 
residues can be substituted with amino acids of 
similar size and polarity without an undue effect upon 
the biological activity of the protein. The human 
herpesvirus proteins have significant tertiary 
structure and the epitopes are usually conformational. 
Thus, modifications should generally preserve 
conformation to produce a protective immune response. 

B. Antibody Prophylaxis 

Therapeutic, intravenous, polyclonal or monoclonal 
antibodies can be used as a mode of passive 
immunotherapy of herpesviral diseases including 
perinatal varicella and CMV. Immune globulin from 
persons previously infected with the human herpesvirus 
and bearing a suitably high titer of antibodies 
against the virus can be given in combination with 
antiviral agents (e.g. ganciclovir), or in combination 
with other modes of immunotherapy that are currently 
being evaluated for the treatment of KS, which are 
targeted to modulating the immune response (i.e. 
treatment with copolymer-1, antiidiotypic monoclonal 
antibodies, T cell "vaccination"). Antibodies to 
human herpesvirus can be administered to the patient 
as described herein. Antibodies specific for an 
epitope expressed on cells infected with the human 
herpesvirus are preferred and can be obtained as 
described above. 

A polypeptide, analog or active fragment can be 
formulated into the therapeutic composition as 
neutralized pharmaceutically acceptable salt forms. 
Pharmaceutically acceptable salts include the acid 
addition salts (formed with the free amino groups of 
the polypeptide or antibody molecule) and which are 
formed with inorganic acids such as, for example, 
hydrochloric or phosphoric acids, or such organic 
acids as acetic, oxalic, tartaric, mandelic, and the 
like. Salts formed from the free carboxyl groups can 
also be derived from inorganic bases such as, for 
example, sodium, potassium, ammonium, calcium, or 
ferric hydroxides, and such organic bases as 
isopropylamine, trimethylamine, 2-ethylamino ethanol, 
histidine, procaine, and the like. 

C. Monitoring therapeutic efficacy 

This invention provides a method for monitoring the 
therapeutic efficacy of treatment for Kaposi's 
sarcoma, which comprises determining in a first sample 
from a subject with Kaposi's sarcoma the presence of 
the isolated DNA molecule, administering to the 
subject a therapeutic amount of an agent such that the 
agent is contacted to the cell in a sample, 
determining after a suitable period of time the amount 
of the isolated DNA molecule in the second sample from 
the treated subject, and comparing the amount of 
isolated DNA molecule determined in the first sample 
with the amount determined in the second sample, a 
difference indicating the effectiveness of the agent, 
thereby monitoring the therapeutic efficacy of 
treatment for Kaposi's sarcoma. As defined herein 
"amount" is viral load or copy number. Methods of 
determining viral load or copy number are known to 
those skilled in the art. 
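
The comparison of the first and second samples can be 
expressed as a fold change or a log10 reduction in copy 
number. The following Python sketch illustrates one way to 
compute both. The function name, the units and the example 
values are hypothetical; the specification does not prescribe a 
particular calculation or a threshold for effectiveness. 

```python
import math


def viral_load_change(first_copies, second_copies):
    """Compare copy number before (first) and after (second) treatment.

    Returns the fold change (second / first) and the log10 reduction.
    Both inputs must be in the same units, e.g. copies per microgram of DNA.
    """
    if first_copies <= 0:
        raise ValueError("first_copies must be positive")
    if second_copies < 0:
        raise ValueError("second_copies cannot be negative")
    fold_change = second_copies / first_copies
    if second_copies == 0:
        log10_reduction = math.inf  # nothing detected in the second sample
    else:
        log10_reduction = math.log10(first_copies / second_copies)
    return {"fold_change": fold_change, "log10_reduction": log10_reduction}


# Hypothetical values: 100,000 copies before treatment, 1,000 after.
print(viral_load_change(100_000, 1_000))
```

A positive log10 reduction indicates a lower amount in the 
second sample; a negative value indicates an increase. 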

VII. Screening Assays For Pharmaceutical Agents of 
Interest in Alleviating the Symptoms of KS. 

Since an agent involved in the causation or 
progression of KS has been identified and described 
here, assays directed to identifying potential 
pharmaceutical agents that inhibit the biological 
activity of the agent are possible. KS drug screening 
assays which determine whether or not a drug has 
activity against the virus described herein are 
contemplated in this invention. Such assays comprise 
incubating a compound to be evaluated for use in KS 
treatment with cells which express the KS associated 
human herpesvirus proteins or peptides and determining 
therefrom the effect of the compound on the activity 
of such agent. In vitro assays in which the virus is 
maintained in suitable cell culture are preferred, 
though in vivo animal models would also be effective. 

Compounds with activity against the agent of interest 
or peptides from such agent can be screened in in 
vitro as well as in vivo assay systems. In vitro 
assays include infecting peripheral blood leukocytes 
or susceptible T cell lines such as MT-4 with the 
agent of interest in the presence of varying 
concentrations of compounds targeted against viral 
replication, including nucleoside analogs, chain 
terminators, antisense oligonucleotides and random 
polypeptides (Asada, K. et al. [7]; Kikuta et al. [48], 
both incorporated by reference herein). Infected 
cultures and their supernatants can be assayed for the 
total amount of virus including the presence of the 
viral genome by quantitative PCR, by dot blot assays, 
or by using immunologic methods. For example, a 
culture of susceptible cells could be infected with 
the human herpesvirus in the presence of various 
concentrations of drug, fixed on slides after a period 
of days, and examined for viral antigen by indirect 
immunofluorescence with monoclonal antibodies to viral 
peptides ([48], supra). Alternatively, chemically 
adhered MT-4 cell monolayers can be used for an 
infectious agent assay using indirect 
immunofluorescent antibody staining to search for 
focus reduction (Higashi, K. et al. [36], incorporated 
by reference herein). 

As an alternative to whole cell in vitro assays, 
purified enzymes isolated from the human herpesvirus 
can be used as targets for rational drug design to 
determine the effect of the potential drug on enzyme 
activity, such as thymidine phosphotransferase or DNA 
polymerase. The genes for these two enzymes are 
provided herein. A measure of enzyme activity 
indicates effect on the agent itself. 

Drug screens using herpes viral products are known and 
have been previously described in EP 0514E30 (herpes 
proteases) and WO 94/04S20 (UL13 gene product). 

This invention provides an assay for screening anti-KS 
chemotherapeutics. Infected cells can be incubated in 
the presence of a chemical agent that is a potential 
chemotherapeutic against KS (e.g. acyclo-guanosine). 
The level of virus in the cells is then determined 
after several days by IFA for antigens or Southern 
blotting for viral genome or Northern blotting for 
mRNA and compared to control cells. This assay can 
quickly screen large numbers of chemical compounds 
that may be useful against KS. 
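
The comparison against control cells described above can be 
summarized as a percent inhibition for each compound. The 
Python sketch below is a minimal illustration. The readout 
(for example, the percentage of antigen-positive cells by 
IFA), the 50% hit threshold and the compound names are 
assumptions made for the example and are not taken from the 
specification. 

```python
def percent_inhibition(treated_signal, control_signal):
    """Percent reduction of the viral readout relative to untreated control."""
    if control_signal <= 0:
        raise ValueError("control_signal must be positive")
    return 100.0 * (1.0 - treated_signal / control_signal)


def select_hits(treated_signals, control_signal, threshold_pct=50.0):
    """Return compound names whose inhibition meets the threshold, best first."""
    scored = {
        name: percent_inhibition(signal, control_signal)
        for name, signal in treated_signals.items()
    }
    hits = [name for name, pct in scored.items() if pct >= threshold_pct]
    return sorted(hits, key=lambda name: scored[name], reverse=True)


# Hypothetical readouts: percent antigen-positive cells after several days.
readouts = {"compound_A": 20.0, "compound_B": 70.0, "compound_C": 8.0}
print(select_hits(readouts, control_signal=80.0))
```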

Further, this invention provides an assay system that 
is employed to identify drugs or other molecules 
capable of binding to the DNA molecule or proteins, 
either in the cytoplasm or in the nucleus, thereby 
inhibiting or potentiating transcriptional activity. 
Such assay would be useful in the development of drugs 
that would be specific against particular cellular 
activity, or that would potentiate such activity, in 
time or in level of activity. 

This invention is further illustrated in the 
Experimental Details section which follows. This 
section is set forth to aid in an understanding of the 
invention but is not intended to, and should not be 
construed to, limit in any way the invention as set 
forth in the claims which follow thereafter. 

EXPERIMENTAL DETAILS SECTION I: 

Experiment 1: Representational difference analysis 
(RDA) to identify and characterize 
unique DNA sequences in KS tissue 

To search for foreign DNA sequences belonging to an 
infectious agent in AIDS-KS, representational 
difference analysis (RDA) was employed to identify and 
characterize unique DNA sequences in KS tissue that 
are either absent or present in low copy number in 
non-diseased tissue obtained from the same patient 
[58]. This method can detect adenovirus genome added 
in single copy to human DNA but has not been used to 
identify previously uncultured infectious agents. RDA 
is performed by making simplified "representations" of 
genomes from diseased and normal tissues from the same 
individual through PCR amplification of short 
restriction fragments. The DNA representation from 
the diseased tissue is then ligated to a priming 
sequence and hybridized to an excess of unligated, 
normal tissue DNA representation. Only unique 
sequences found in the diseased tissue have priming 
sequences on both DNA strands and are preferentially 
amplified during subsequent rounds of PCR 
amplification. This process can be repeated using 
different ligated priming sequences to enrich the 
sample for unique DNA sequences that are only found in 
the tissue of interest. 

DNA (10 µg) extracted from both the KS lesion and 
unaffected tissue were separately digested to 
completion with Bam HI (20 units/µg) at 37°C for 2 
hours and 2 µg of digestion fragments were ligated to 
NBam12 and NBam24 priming sequences [primer sequences 
described in 58]. Thirty cycles of PCR amplification 
were performed to amplify "representations" of both 
genomes. After construction of the genomic 
representations, KS tester amplicons between 150 and 
1500 bp were isolated from an agarose gel and NBam 
priming sequences were removed by digestion with Bam 
HI. To search for unique DNA sequences not found in 
non-KS driver DNA, a second set of priming sequences 
(JBam12 and JBam24) was ligated onto only the KS 
tester DNA amplicons (Figure 1, lane 1). 0.2 µg of 
ligated KS lesion amplicons were hybridized to 20 µg 
of unligated, normal tissue representational 
amplicons. An aliquot of the hybridization product 
was then subjected to 10 cycles of PCR amplification 
using JBam24, followed by mung bean nuclease 
digestion. An aliquot of the mung bean-treated 
difference product was then subjected to 15 more 
cycles of PCR with the JBam24 primer (Figure 1, lane 
2). Amplification products were redigested with Bam 
HI and 200 ng of the digested product was ligated to 
RBam12 and RBam24 primer sets for a second round of 
hybridization and PCR amplification (Figure 1, lane 
3). This enrichment procedure was repeated a third 
time using the JBam primer set (Figure 1, lane 4). 
Both the original driver and the tester DNA samples 
(Table 2, Patient A) were subsequently found to 
contain the AIDS-KS specific sequences KS330Bam and 
KS631Bam (previously identified as KS627Bam) 
indicating that RDA can be successfully employed when 
the target sequences are present in unequal copy 
number in both tissues. 



The initial round of DNA amplification-hybridization 
from KS and normal tissue resulted in a diffuse 
banding pattern (Figure 1, lane 2), but four bands at 
approximately 380, 450, 540 and 680 bp were 
identifiable after the second amplification- 
hybridization (Figure 1, lane 3). These bands became 
discrete after a third round of amplification- 
hybridization (Figure 1, lane 4). Control RDA, 
performed by hybridizing DNA extracted from AIDS-KS 
tissue against itself, produced a single band at 
approximately 540 bp (Figure 1, lane 5). The four KS- 
associated bands (designated KS330Bam, KS390Bam, 
KS480Bam, KS627Bam after digestion of the two flanking 
28 bp ligated priming sequences with Bam HI) were gel 
purified and cloned by insertion into the pCRII 
vector. PCR products were cloned in the pCRII vector 
using the TA cloning system (Invitrogen Corporation, 
San Diego, CA). 



Experiment 2: Determination of the specificity of 
AIDS-KS unique sequences. 

To determine the specificity of these sequences for 
AIDS-KS, random-primed 32P-labeled inserts were 
hybridized to Southern blots of DNA extracted from 
cryopreserved tissues obtained from patients with and 
without AIDS. All AIDS-KS specimens were examined 
microscopically for morphologic confirmation of KS and 
immunohistochemically for Factor VIII, Ulex europaeus 
and CD34 antigen expression. One of the AIDS-KS 
specimens was apparently mislabeled since KS tissue 
was not detected on microscopic examination but was 
included in the KS specimen group for purposes of 
statistical analysis. Control tissues used for 
comparison to the KS lesions included 56 lymphomas 
from patients with and without AIDS, 19 hyperplastic 
lymph nodes from patients with and without AIDS, 5 
vascular tumors from non-AIDS patients and 13 tissues 
infected with opportunistic infections that commonly 
occur in AIDS patients. Control DNA was also 
extracted from a consecutive series of 49 surgical 
biopsy specimens from patients without AIDS. 
Additional clinical and demographic information on the 
specimens was not collected to preserve patient 
confidentiality. 

The tissues, listed in Table 1, were collected from 
diagnostic biopsies and autopsies between 1983 and 
1993 and stored at -70°C. Each tissue sample was from 
a different patient, except as noted in Table 1. Most 
of the 27 KS specimens were from lymph nodes dissected 
under surgical conditions, which diminishes possible 
contamination with normal skin flora. All specimens 
were digested with Bam HI prior to hybridization. 

KS390Bam and KS480Bam hybridized nonspecifically to 
both KS and non-KS tissues and were not further 
characterized. 20 of 27 (74%) AIDS-KS DNAs hybridized 
with variable intensity to both KS330Bam and KS627Bam, 
and one additional KS specimen hybridized only to 
KS627Bam by Southern blotting (Figure 2 and Table 1). 
In contrast to AIDS-KS lesions, only 6 of 39 (15%) 
non-KS tissues from patients with AIDS hybridized to 
the KS330Bam and KS627Bam inserts (Table 1). 



Specific hybridization did not occur with lymphoma or 
lymph node DNA from 36 persons without AIDS or with 
control DNA from 49 tissue biopsy specimens obtained 
from a consecutive series of patients. DNA extracted 
from several vascular tumors, including a 
hemangiopericytoma, two angiosarcomas and a 
lymphangioma, were also negative by Southern blot 
hybridization. DNA extracted from tissues with 
opportunistic infections common to AIDS patients, 
including 7 acid-fast bacillus (undetermined species), 
1 cytomegalovirus, 1 cat-scratch bacillus, 2 
cryptococcus and 1 toxoplasmosis infected tissues, 
were negative by Southern blot hybridization to 
KS330Bam and KS627Bam (Table 1). 

Table 1. Southern blot hybridization for KS330Bam and 
KS627Bam and PCR amplification for KS330_234 
in human tissues from individual patients 

                                 KS330Bam       KS627Bam       KS330_234
                                 Southern       Southern       PCR
                                 hybridization  hybridization  positive
Tissue                    n      n (%)          n (%)          n (%)

AIDS-KS                   27*    20 (74)        21 (78)        25 (93)
AIDS lymphomas            27†    3 (11)         3 (11)         3 (11)
AIDS lymph nodes          12     3 (25)         3 (25)         3 (25)
Non-AIDS lymphomas        29‡    0 (0)          0 (0)          0 (0)
Non-AIDS lymph nodes      7      0 (0)          0 (0)          0 (0)
Vascular tumors           4§     0 (0)          0 (0)          0 (0)
Opportunistic infections  13¶    0 (0)          0 (0)          0 (0)
Consecutive surgical      49‖**  0 (0)          0 (0)          0 (0)
  biopsies

Legend to Table 1: 

* Includes one AIDS-KS specimen unamplifiable for p53 
exon 6 and one tissue which on microscopic examination 
did not have any detectable KS tissue present. Both 
of these samples were negative by Southern blot 
hybridization to KS330Bam and KS627Bam and by PCR 
amplification for the KS330_234 amplicon. 

† Includes 7 small non-cleaved cell lymphomas, 20 
diffuse large cell and immunoblastic lymphomas. Three 
of the lymphomas with immunoblastic morphology were 
positive for KS330Bam and KS627Bam. 

‡ Includes 13 anaplastic large cell lymphomas, 4 
diffuse large cell lymphomas, 4 small lymphocytic 
lymphomas/chronic lymphocytic leukemias, 3 hairy cell 
leukemias, 2 monocytoid B-cell lymphomas, 1 follicular 
small cleaved cell lymphoma, 1 Burkitt's lymphoma, 1 
plasmacytoma. 

§ Includes 2 angiosarcomas, 1 hemangiopericytoma and 
1 lymphangioma. 

¶ Includes 2 cryptococcus, 1 toxoplasmosis, 1 cat- 
scratch bacillus, 1 cytomegalovirus, 1 Epstein-Barr 
virus, and 7 acid-fast bacillus infected tissues. In 
addition, pure cultures of Mycobacterium avium-complex 
were negative by Southern hybridization and PCR, and 
pure cultures of Mycoplasma penetrans were negative by 
PCR. 

‖ Tissues included skin, appendix, kidney, prostate, 
hernia sac, lung, fibrous tissue, gallbladder, colon, 
foreskin, thyroid, small bowel, adenoid, vein, 
axillary tissue, lipoma, heart, mouth, hemorrhoid, 
pseudoaneurysm and fistula track. Tissues were 
collected from a consecutive series of biopsies on 
patients without AIDS but with unknown HIV serostatus. 

** Apparent nonspecific hybridization at approximately 
20 Kb occurred in 4 consecutive surgical biopsy DNA 
samples: one colon and one hernia sac DNA sample 
hybridized to KS330Bam alone, another hernia sac DNA 
sample hybridized to KS627Bam alone and one appendix 
DNA sample hybridized to both KS330Bam and KS627Bam. 
These samples did not hybridize in the 330-630 bp 
range expected for these sequences and were PCR 
negative for KS330_234. 

In addition, DNA from Epstein-Barr virus-infected 
peripheral blood lymphocytes and pure cultures of 
Mycobacterium avium-complex were also negative by 
Southern hybridization. Overall, 20 of 27 (74%) AIDS- 
KS specimens hybridized to KS330Bam and 21 of 27 (78%) 
AIDS-KS specimens hybridized to KS627Bam, compared to 
only 6 of 142 (4%) non-KS human DNA control specimens 
(χ² = 85.02, p < 10⁻⁷ and χ² = 92.4, p < 10⁻⁷ respectively). 
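
The specification does not state how the χ² values were 
computed. As a sketch, and assuming an uncorrected Pearson 
chi-square on a 2x2 table (an assumption, not the authors' 
stated method), the KS330Bam comparison of 20 of 27 AIDS-KS 
specimens against 6 of 142 control specimens can be 
recomputed as follows. The function name is hypothetical. 

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table

        [[a, b],
         [c, d]]

    where rows are groups and columns are positive / negative counts.
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denominator


# KS330Bam: 20 of 27 AIDS-KS positive, 6 of 142 controls positive.
chi2 = chi_square_2x2(20, 27 - 20, 6, 142 - 6)
print(round(chi2, 2))  # 85.02
```

Under this assumption the result agrees with the χ² = 85.02 
reported for KS330Bam. 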

The sequence copy number in the AIDS-KS tissues was 
estimated by simultaneous hybridization with KS330Bam 
and a 440 bp probe for the constant region of the T 
cell receptor β gene [76]. Samples in lanes 5 and 6 
of Figures 2A-2B showed similar intensities for the 
two probes indicating an average copy number of 
approximately two KS330Bam sequences per cell, while 
remaining tissues had weaker hybridization signals for 
the KS330Bam probe. 

Experiment 3: Characterization of KS330Bam and 
KS627Bam 

To further characterize KS330Bam and KS627Bam, six 
clones for each insert were sequenced. The Sequenase 
version 2.0 (United States Biochemical, Cleveland, OH) 
system was used and sequencing was performed according 
to manufacturer's instructions. Nucleotide sequences 
were confirmed with an Applied Biosystems 373A 
Sequencer in the DNA Sequencing Facilities at Columbia 
University. 

KS330Bam is a 330 bp sequence with 51% G:C content 
(Figure 3B) and KS627Bam is a 627 bp sequence with a 
63% G:C content (Figure 3C). KS330Bam has 54% 
nucleotide identity to the BDLF1 open reading frame 
(ORF) of Epstein-Barr virus (EBV). Further analysis 
revealed that both KS330Bam and KS627Bam code for 
amino acid sequences with homology to polypeptides of 
viral origin. SwissProt and PIR protein databases 
were searched for homologous ORF using BLASTX [3]. 
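
G:C content, as quoted above for KS330Bam and KS627Bam, is the 
percentage of G and C bases in a sequence. The short Python 
sketch below shows the calculation. The function name and the 
example sequence are hypothetical; the actual KS330Bam and 
KS627Bam sequences appear in Figures 3B and 3C and are not 
reproduced here. 

```python
def gc_content(sequence):
    """Return the percentage of G and C among the A/C/G/T bases of a sequence."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)


# Hypothetical 8-base example, not a sequence from the specification.
print(gc_content("ATGCGCTA"))
```

Characters other than A, C, G and T (for example N or 
whitespace) are ignored in this sketch, which is a design 
choice rather than a convention taken from the text. 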

KS330Bam is 51% identical by amino acid homology to a 
portion of the ORF26 open reading frame encoding the 
capsid protein VP23 (NCBI g.i. 60348, bp 46024-46935) 
of herpesvirus saimiri [2], a gammaherpesvirus 
which causes fulminant lymphoma in New World monkeys. 
This fragment also has a 39% identical amino acid 
sequence to the theoretical protein encoded by the 
homologous open reading frame BDLF1 in EBV (NCBI g.i. 
59140, bp 132403-133307) [9]. The amino acid 
sequence encoded by KS627Bam is homologous with weaker 
identity (31%) to the tegument protein, gp140 (ORF 29, 
NCBI g.i. 60396, bp 108782-112681) of herpesvirus 
saimiri. 



Sequence data from KS330Bam was used to construct PCR 
primers to amplify a 234 bp fragment designated 
KS330_234 (Figure 3B). The conditions for PCR analyses 
were as follows: 94°C for 2 min (1 cycle); 94°C for 
1 min, 58°C for 1 min, 72°C for 1 min (35 cycles); 72°C 
extension for 5 min (1 cycle). Each PCR reaction used 
0.1 µg of genomic DNA, 50 pmoles of each primer, 1 
unit of Taq polymerase, 100 µM of each 
deoxynucleotide triphosphate, 50 mM KCl, 10 mM Tris-HCl 
(pH 9.0), and 0.1% Triton-X-100 in a final volume of 
25 µl. Amplifications were carried out in a Perkin- 
Elmer 480 Thermocycler with 1-s ramp times between 
steps. 
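
The cycling conditions and reagent amounts above can be 
written down as data and checked with simple arithmetic. The 
Python sketch below encodes the stated program and computes the 
total hold time and the primer concentration implied by 50 
pmoles in 25 µl. The data layout and function names are 
hypothetical and do not correspond to any thermocycler's 
programming interface; ramp times are excluded. 

```python
# Each stage: number of cycles and a list of (temperature in C, seconds) steps.
PCR_PROGRAM = [
    {"cycles": 1, "steps": [(94, 120)]},                      # initial denaturation
    {"cycles": 35, "steps": [(94, 60), (58, 60), (72, 60)]},  # denature, anneal, extend
    {"cycles": 1, "steps": [(72, 300)]},                      # final extension
]


def total_hold_seconds(program):
    """Sum the hold times of all steps over all cycles (ramp times excluded)."""
    return sum(
        stage["cycles"] * sum(seconds for _, seconds in stage["steps"])
        for stage in program
    )


def concentration_uM(amount_pmol, volume_ul):
    """pmol per microliter is numerically equal to micromolar."""
    if volume_ul <= 0:
        raise ValueError("volume_ul must be positive")
    return amount_pmol / volume_ul


print(total_hold_seconds(PCR_PROGRAM) / 60)  # 112.0 minutes of hold time
print(concentration_uM(50, 25))              # 2.0 uM of each primer
```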



Although Southern blot hybridization detected the
KS330Bam sequence in only 20 of 27 KS tissues, 25 of
the 27 tissues were positive by PCR amplification for
KS330₂₃₄ (Figures 4A-4B), demonstrating that KS330Bam is
present in some KS lesions at levels below the
threshold for detection by Southern blot
hybridization. All KS330₂₃₄ PCR products hybridized to
a ³²P end-labelled 25 bp internal oligomer, confirming
the specificity of the PCR (Figure 4B). Of the two
AIDS-KS specimens negative for KS330₂₃₄, both specimens
appeared to be negative for technical reasons: one
had no microscopically detectable KS tissue in the
frozen sample (Figures 4A-4B, lane 3), and the other
(Figures 4A-4B, lane 15) was negative in the control
PCR amplification for the p53 gene, indicating either
DNA degradation or the presence of PCR inhibitors in
the sample. PCR amplification of the p53 tumor
suppressor gene was used as a control for DNA quality.
The sequences of the p53 primers were P6-5,
5'-ACAGGGCTGGTTGCCCAGGGT-3' (SEQ ID NO: 44); and P6-3,
5'-AGTTGCAAACCAGACCTCAG-3' (SEQ ID NO: 45) [25].
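As a quick sanity check on the two p53 control primers, their length, G:C content and a rough melting temperature can be computed directly from the sequences given above. This is an illustrative sketch only: the Wallace rule used here (2°C per A/T, 4°C per G/C) is a crude rule of thumb for short oligonucleotides and is an assumption of this example, not a calculation reported in the text.

```python
# Primer sequences copied from the text (SEQ ID NO: 44 and 45).
P53_PRIMERS = {
    "P6-5": "ACAGGGCTGGTTGCCCAGGGT",
    "P6-3": "AGTTGCAAACCAGACCTCAG",
}


def gc_count(seq):
    """Number of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def gc_percent(seq):
    """G:C content as a percentage of sequence length."""
    return 100.0 * gc_count(seq) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in deg C: 2*(A+T) + 4*(G+C)."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in P53_PRIMERS.items():
        print(name, len(seq), "nt,",
              round(gc_percent(seq), 1), "% GC,",
              wallace_tm(seq), "C (Wallace estimate)")
```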

Except for the 6 control samples from AIDS patients
that were also positive by Southern blot
hybridization, none of the other 136 control specimens
were positive by PCR for KS330₂₃₄. All of these
specimens were amplifiable for the p53 gene,
indicating that inadequate PCR amplification was not
the reason for lack of detection of KS330₂₃₄ in the
control tissues. Samples containing DNA from two
candidate KS agents, EBV and Mycoplasma penetrans
(ATCC Accession No. 55252), a pathogen commonly found
in the genital tract of patients with AIDS-KS [59],
were also negative for amplification of KS330₂₃₄. In
addition, several KS specimens were tested using
commercial PCR primers (Stratagene, La Jolla, CA)
specific for mycoplasmata and primers specific for the
EBNA-2, EBNA-3C and EBER regions of EBV and were
negative [57].

Overall, DNA from 25 (93%) of 27 AIDS-KS tissues was
positive by PCR compared with DNA from 6 (4%) of 142
control tissues, including 6 (15%) of 39 non-KS lymph
nodes and lymphomas from AIDS patients (χ² = 38.2, p <
10⁻⁶), 0 of 36 lymph nodes and lymphomas from non-AIDS
patients (χ² = 55.2, p < 10^) and 0 of 49 consecutive
biopsy specimens (χ² = 67.7, p < 10⁻⁷). Thus, KS330₂₃₄ was
found in all 25 amplifiable tissues with
microscopically detectable AIDS-KS, but rarely
occurred in non-KS tissues, including tissues from
AIDS patients.
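The comparisons above are 2x2 tables of PCR-positive and PCR-negative tissues. The text does not say which form of the chi-square test was used; the sketch below assumes an uncorrected Pearson chi-square, which gives values close to those reported. The function name `pearson_chi2_2x2` is illustrative.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2x2 table [[a, b], [c, d]].

    a, b: positives and negatives in the first group
    c, d: positives and negatives in the second group
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# AIDS-KS tissues: 25 positive, 2 negative (from the text).
KS_POS, KS_NEG = 25, 2

# Control groups as (positive, negative) counts taken from the text.
CONTROLS = {
    "AIDS non-KS lymph nodes and lymphomas": (6, 33),
    "non-AIDS lymph nodes and lymphomas": (0, 36),
    "consecutive biopsy specimens": (0, 49),
}

if __name__ == "__main__":
    for label, (pos, neg) in CONTROLS.items():
        chi2 = pearson_chi2_2x2(KS_POS, KS_NEG, pos, neg)
        print(f"{label}: chi-square = {chi2:.1f}")
```

Small differences in the last digit relative to the reported values may reflect rounding or a different test variant in the original analysis.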



Of the six control tissues from AIDS patients that 
were positive by both PCR and Southern hybridization, 
two patients had KS elsewhere, two did not develop KS 
and complete clinical histories for the remaining two 
patients were unobtainable. Three of the six positive 
non-KS tissues were lymph nodes with follicular 
hyperplasia taken from patients with AIDS. Given the 
high prevalence of KS among patients with AIDS, it is 
possible that undetected microscopic foci of KS were 
present in these lymph nodes. The other three
positive tissue specimens were B cell immunoblastic
lymphomas from AIDS patients. It is possible that the
putative KS agent is also a cofactor for a subset of 
AIDS-associated lymphomas [16, 17, 60] . 

To determine whether KS330Bam and KS627Bam are
portions of a larger genome and to determine the
proximity of the two sequences to each other, samples
of KS DNA were digested with Pvu II restriction
enzyme. Digested genomic DNA samples were hybridized
to KS330Bam and KS627Bam by Southern blotting. These
sequences hybridized to various sized fragments of the
digested KS DNA, indicating that both sequences are
fragments of larger genomes. Differences in the
KS330Bam hybridization pattern to Pvu II digests of
the three AIDS-KS specimens indicate that
polymorphisms may occur in the larger genome.
Individual fragments from the digests failed to
simultaneously hybridize with both KS330Bam and
KS627Bam, demonstrating that these two Bam HI
restriction fragments are not adjacent to one another.

If KS330Bam and KS627Bam are heritable polymorphic DNA
markers for KS, these sequences should be uniformly
detected at non-KS tissue sites in patients with
AIDS-KS. Alternatively, if KS330Bam and KS627Bam are
sequences specific for an exogenous infectious agent,
it is likely that some tissues are uninfected and lack
detectable KS330Bam and KS627Bam sequences. DNA
extracted from multiple uninvolved tissues from three
patients with AIDS-KS was hybridized to ³²P-labelled
KS330Bam and KS627Bam probes as well as analyzed by
PCR using the KS330₂₃₄ primers (Table 2). While KS
lesion DNA samples were positive for both bands,
unaffected tissues were frequently negative for these
sequences. KS lesions from patients A, B and C, and
uninvolved skin and muscle from patient A were
positive for KS330Bam and KS627Bam, but muscle and
brain tissue from patient B and muscle, brain, colon,
heart and hilar lymph node tissues from patient C were
negative for these sequences. Uninvolved stomach
lining adjacent to the KS lesion in patient C was
positive by PCR, but negative by Southern blotting,
which suggests the presence of the sequences in this
tissue at levels below the detection threshold for
Southern blotting.

Table 2: Differential detection of KS330Bam, KS627Bam
         and KS330₂₃₄ sequences in KS-involved and
         non-involved tissues from three patients
         with AIDS-KS.

                               KS330Bam  KS627Bam  KS330₂₃₄

Patient A
  KS, skin                        +         +
  nl skin                         +         +
  nl muscle                       +         +

Patient B
  KS, skin                        +         +
  nl muscle                       -         -
  nl brain                        -         -

Patient C
  KS, stomach                     +         +
  nl stomach adjacent to KS       -         -         +
  nl muscle                       -         -
  nl brain                        -         -
  nl colon                        -         -
  nl heart                        -         -
  nl hilar lymph nodes            -         -

Experiment 4: Subcloning and sequencing of KSHV

KS330Bam and KS627Bam are genomic fragments of a novel
infectious agent associated with AIDS-KS. A genomic
library from a KS lesion was made and a phage clone
with a 20 kb insert containing the KS330Bam sequence
was identified. The 20 kb clone digested with PvuII
(which cuts in the middle of the KS330Bam sequence)
produced 1.1 kb and 3 kb fragments that hybridized to
KS330Bam. The 1.1 kb subcloned insert and ~500 bp
from the 3 kb subcloned insert, resulting in 3404 bp of
contiguous sequence, were entirely sequenced. This
sequence contains partial and complete open reading
frames homologous to regions in gamma herpesviruses.

The KS330Bam sequence is an internal portion of a 918
bp ORF with 55-56% nucleotide identity to the ORF26
and BDLF1 genes of HSVSA and EBV respectively. The
EBV and HSVSA translated amino acid sequences for
these ORFs demonstrate extensive homology with the
amino acid sequence encoded by the KS-associated 918
bp ORF (Figure 6). In HSVSA, the VP23 protein is a
late structural protein involved in capsid
construction. Reverse transcriptase (RT)-PCR of mRNA
from a KS lesion is positive for transcribed KS330Bam
mRNA, which indicates that this ORF is transcribed
in KS lesions. Additional evidence for homology
between the KS agent and herpesviruses comes from a
comparison of the genomic organization of other
potential ORFs on the 9404 bp sequence (Figure 3A).
The 5' terminus of the sequence is composed of
nucleotides having 66-67% nucleotide identity and
68-71% amino acid identity to corresponding regions of
the major capsid protein (MCP) ORFs for both EBV and
HSVSA. This putative MCP ORF of the KS agent lies
immediately 5' to the BDLF1/ORF26 homolog, which is a
conserved orientation among herpesvirus subfamilies
for these two genes. At the 3' end of this sequence,
the reading frame has strong amino acid and nucleotide
homology to HSVSA ORF 27. Thus, KS-associated DNA
sequences at four loci in two separate regions with
homologies to gamma herpesviral genomes have been
identified.

In addition to fragments obtained from the Pvu II
digest of the 21 kb phage insert described above,
fragments obtained from a BamHI/NotI digest were also
subcloned into pBluescript (Stratagene, La Jolla, CA).
The termini of these subcloned fragments were sequenced
and were also found to be homologous to nucleic acid
sequences of EBV and HSVSA genes. These homologs have
been used to develop a preliminary map of subcloned
fragments (Figure 9). Thus, sequencing has revealed
that the KS agent maintains co-linear homology to
gamma herpesviruses over the length of the 21 kb phage
insert.

Experiment 5: Determination of the phylogeny of KSHV

Regions flanking KS330Bam were sequenced and
characterized by directional walking. This was
performed by the following strategy: 1) KS genomic
libraries were made and screened using the KS330Bam
fragment as a hybridization probe, 2) DNA inserts from
phage clones positive for the KS330Bam probe were
isolated and digested with suitable restriction
enzyme(s), 3) the digested fragments were subcloned
into pBluescript (Stratagene, La Jolla, CA), and 4)
the subclones were sequenced. Using this strategy,
the major capsid protein (MCP) ORF homolog was the
first important gene locus identified. Using
sequenced unique 3' and 5' end-fragments from positive
phage clones as probes, and following the strategy
above, KS genomic libraries are screened by standard
methods for additional contiguous sequences.

For sequencing purposes, restriction fragments are
subcloned into phagemid pBluescript KS+, pBluescript
KS-, pBS+, or pBS- (Stratagene) or into plasmid pUC18
or pUC19. Recombinant DNA was purified through CsCl
density gradients or by anion-exchange chromatography
(Qiagen).

Nucleotide sequencing of cloned fragments of KSHV,
identified by standard screening methods, was done by
direct sequencing of double-stranded DNA using
oligonucleotide primers synthesized commercially to
"walk" along the fragments by the dideoxy-nucleotide
chain termination method. Junctions between clones
are confirmed by sequencing overlapping clones.

Targeted homologous genes in regions flanking KS330Bam
include, but are not limited to: IL-10 homolog,
thymidine kinase (TK), g85, g35, gH, capsid proteins
and MCP. TK is an early protein of the herpesviruses
functionally linked to DNA replication and a target
enzyme for anti-herpesviral nucleosides. TK
phosphorylates acyclic nucleosides such as acyclovir,
which in turn inhibit viral DNA polymerase chain
extension. Determining the sequence of this gene will
aid in the prediction of chemotherapeutic agents
useful against KSHV. TK is encoded by the EBV BXLF1
ORF located ~9700 bp rightward of BDLF1 and by the
HSVSA ORF 21 ~9200 bp rightward of ORF 26. A
subcloned fragment of KS5 was identified with strong
homology to the EBV and HSVSA TK open reading frames.

g85 is a late glycoprotein involved in membrane fusion,
homologous to gH in HSV1. In EBV, this protein is
encoded by the BXLF2 ORF located ~7600 bp rightward of
BDLF1, and in HSVSA it is encoded by ORF 22 located
~7100 bp rightward of ORF 26.

g35 is a late EBV glycoprotein found in virion and
plasma membrane. It is encoded by the BDLF3 ORF, which
is 1300 bp leftward of BDLF1 in EBV. There is no BDLF3
homolog in HSVSA. A subcloned fragment has already
been identified with strong homology to the EBV gp35
open reading frame.



Major capsid protein (MCP) is a conserved 150 kDa
protein which is the major component of the herpesvirus
capsid. Antibodies are generated against the MCP
during natural infection with most herpesviruses. The
terminal 1026 bp of this major capsid gene homolog in
KSHV have been sequenced.

Targeted homologous genes/loci in regions flanking
KS627Bam include, but are not limited to: terminal
reiterated repeats, LMP1, EBERs and Ori P. Terminal
reiterated sequences are present in all herpesviruses.
In EBV, tandemly reiterated 0.5 kb long terminal
repeats flank the ends of the linear genome and become
joined in the circular form. The terminal repeat
region is immediately adjacent to BNRF1 in EBV and ORF
75 in HSVSA. Since the number of terminal repeats
varies between viral strains, identification of
terminal repeat regions may allow typing and clonality
studies of KSHV in KS lesions. Sequencing through the
terminal repeat region may determine whether this
virus is integrated into the human genome in KS.

LMP1 is a latent protein important in the
transforming effects of EBV in Burkitt's lymphoma.
This gene is encoded by the EBV BNLF1 ORF located
~2000 bp rightward of the tegument protein ORF BNRF1 in
the circularized genome. There is no LMP1 homolog in
HSVSA.

EBERs are the most abundant RNA in latently EBV
infected cells and Ori-P is the origin of replication
for the latent EBV genome. This region is located
~4000-9000 bp leftward of the BNRF1 ORF in EBV; there
are no corresponding regions in HSVSA.

The data indicate that the KS agent is a new human
herpesvirus related to gamma herpesviruses EBV and
HSVSA. The results are not due to contamination or to
incidental co-infection with a known herpesvirus since
the sequences are distinct from all sequenced
herpesviral genomes (including EBV, CMV, HHV6 and
HSVSA) and are associated specifically with KS in
three separate comparative studies. Furthermore, PCR
testing of KS DNA with primers specific for EBV-1 and
EBV-2 failed to demonstrate these viral genomes in
these tissues. Although KSHV is homologous to EBV
regions, the sequence does not match any other known
sequence and thus provides evidence for a new viral
genome, related to but distinct from known members of
the herpesvirus family.

Experiment 6: Serological studies

Indirect immunofluorescence assay (IFA)

Virus-containing cells are coated to a microscope
slide. The slides are treated with organic fixatives,
dried and then incubated with patient sera.
Antibodies in the sera bind to the cells, and then
excess nonspecific antibodies are washed off. An
antihuman immunoglobulin linked to a fluorochrome,
such as fluorescein, is then incubated with the
slides, and then excess fluorescent immunoglobulin is
washed off. The slides are then examined under a
microscope and if the cells fluoresce, then this
indicates that the sera contain antibodies directed
against the antigens present in the cells, such as the
virus.

An indirect immunofluorescence assay (IFA) was
performed on the Body Cavity-Based Lymphoma cell line
(BCBL-1), which is a naturally transformed EBV
infected (nonproducing) B cell line, using 4 KS
patient sera and 4 control sera (from AIDS patients
without KS). Initially, both sets of sera showed
similar levels of antibody binding. To remove
nonspecific antibodies directed against EBV and
lymphocyte antigens, sera at 1:25 dilution were
pre-adsorbed using 3x10⁶ 1% paraformaldehyde-fixed Raji
cells per ml of sera. BCBL1 cells were fixed with
ethanol/acetone, incubated with dilutions of patient
sera, washed and incubated with fluorescein-conjugated
goat anti-human IgG. Indirect immunofluorescent
staining was determined.

Table 3 shows that unadsorbed case and control sera
have similar end-point dilution indirect
immunofluorescence assay (IFA) titers against the
BCBL1 cell line. After Raji adsorption, case sera
have four-fold higher IFA titers against BCBL1 cells
than control sera. Results indicated that
pre-adsorption against paraformaldehyde-fixed Raji cells
reduces fluorescent antibody binding in control sera
but does not eliminate antibody binding to KS case sera.
These results indicate that subjects with KS have
specific antibodies directed against the KS agent that
can be detected in serological assays such as IFA,
Western blot and enzyme immunoassays (Table 3).

Table 3: Indirect immunofluorescence end-point titers
         for KS case and non-KS control sera against
         the BCBL-1 cell line

Sera No.  Status*   Pre-adsorption  Post-adsorption**

   1      KS           > 1:400         > 1:400
   2      KS             1:100           1:100
   3      KS             1:200           1:100
   4      KS           > 1:400           1:200

   5      Control      > 1:400           1:50
   6      Control        1:50            1:50
   7      Control        1:100           1:50
   8      Control        1:200           1:50

Legend Table 3:

*  KS=autopsy-confirmed male, AIDS patient
   Control=autopsy-confirmed female, AIDS patient,
   no KS

** Adsorbed against Raji cells treated with 1%
   paraformaldehyde
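One way to summarize Table 3 is by the geometric mean of the reciprocal end-point titers in each group. The sketch below is illustrative only: it treats "> 1:400" as exactly 400, which is a lower-bound assumption of this example, and the helper names are hypothetical.

```python
import math

# (pre-adsorption, post-adsorption) reciprocal titers from Table 3.
# Assumption: "> 1:400" is entered as 400, so case values are lower bounds.
KS_SERA = [(400, 400), (100, 100), (200, 100), (400, 200)]
CONTROL_SERA = [(400, 50), (50, 50), (100, 50), (200, 50)]


def geometric_mean(values):
    """Geometric mean of positive numbers, computed on the log scale."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def gmt_ratio(column):
    """Case-to-control ratio of geometric mean titers.

    column 0 is pre-adsorption, column 1 is post-adsorption.
    """
    cases = geometric_mean(s[column] for s in KS_SERA)
    controls = geometric_mean(s[column] for s in CONTROL_SERA)
    return cases / controls


if __name__ == "__main__":
    print("pre-adsorption ratio: ", round(gmt_ratio(0), 2))
    print("post-adsorption ratio:", round(gmt_ratio(1), 2))
```

Because two of the case titers are only known to exceed 1:400, the post-adsorption ratio computed this way understates the true difference between case and control sera.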

Immunoblotting ("Western blot")

Virus-containing cells or purified virus (or a portion
of the virus, such as a fusion protein) is
electrophoresed on a polyacrylamide gel to separate
the protein antigens by molecular weight. The
proteins are blotted onto a nitrocellulose or nylon
membrane, then the membrane is incubated in patient
sera. Antibodies directed against specific antigens
are developed by incubating with an anti-human
immunoglobulin attached to a reporter enzyme, such as
a peroxidase. After developing the membrane, each
antigen reacting against antibodies in patient sera
shows up as a band on the membrane at the
corresponding molecular weight region.

Enzyme immunoassay ("EIA" or "ELISA")

Virus-containing cells or purified virus (or a portion
of the virus, such as a fusion protein) is coated to
the bottom of a 96-well plate by various means
(generally incubating in alkaline carbonate buffer).
The plates are washed, then the wells are incubated
with patient sera. Antibodies in the sera directed
against specific antigens stick on the plate. The
wells are washed again to remove nonspecific antibody,
then they are incubated with an antihuman
immunoglobulin attached to a reporter enzyme, such as
a peroxidase. The plate is washed again to remove
nonspecific antibody and then developed. Wells
containing antigen that is specifically recognized by
antibodies in the patient's sera change color and can
be detected by an ELISA plate reader (a
spectrophotometer).

All three of these methods can be made more specific
by pre-incubating patient sera with uninfected cells
to adsorb out cross-reacting antibodies against the
cells or against other viruses that may be present in
the cell line, such as EBV. Cross-reacting antibodies
can potentially give a falsely positive test result
(i.e. the patient is actually not infected with the
virus but has a positive test result because of
cross-reacting antibodies directed against cell
antigens in the preparation). The importance of the
infection experiments with Raji is that if Raji cells,
or another well-defined cell line, can be infected, then
the patient's sera can be pre-adsorbed against the
uninfected parental cell line and then tested in one
of the assays. The only antibodies left in the sera
after pre-adsorption that bind to antigens in the
preparation should be directed against the virus.

Experiment 7:

BCBL-1, from lymphomatous tissues belonging to a rare
infiltrating, anaplastic body cavity lymphoma
occurring in AIDS patients, has been placed in
continuous cell culture and shown to be continuously
infected with the KS agent. This cell line is also
naturally infected with Epstein-Barr virus (EBV). The
BCBL cell line was used as an antigen substrate to
detect specific KS antibodies in persons infected with
the putative virus by Western blotting. Three
lymphoid B cell lines were used as controls. These
included the EBV genome positive cell line P3H3, the
EBV genome defective cell line Raji and the EBV genome
negative cell line Bjab.

Cells from late-log phase culture were washed 3 times
with PBS by centrifugation at 500 g for 10 min and
suspended in sample buffer containing 50 mM Tris-HCl
pH 6.8, 2% SDS (w/v), 15% glycerol (v/v), 5%
β-mercaptoethanol (v/v) and 0.001% bromophenol (w/v)
with protease inhibitor, 100 phenylmethylsulfonyl
fluoride (PMSF). The sample was boiled at 100°C for
5 min and centrifuged at 14,000 g for 10 min. The
proteins in the supernatant were then fractionated by
sodium dodecyl sulfate-polyacrylamide gel
electrophoresis (SDS-PAGE) under reducing conditions
with a separation gel of 15% and a stacking gel of 5%
(3). Prestained protein standards were included:
myosin, 200 kDa; β-galactosidase, 118 kDa; BSA, 78
kDa; ovalbumin, 47.1 kDa; carbonic anhydrase, 31.4
kDa; soybean trypsin inhibitor, 25.5 kDa; lysozyme,
18.8 kDa and aprotinin, 8.3 kDa (Bio-Rad).

Immunoblotting experiments were performed according to
the method of Towbin et al. (4). Briefly, the
proteins were electrophoretically transferred to
Hybond-C extra membranes (Pharmacia) at 24 V for 7l-
min. The membranes were then dried at 7 c for ^
min, saturated with 5% skim milk in Tris-buffered
saline, pH 7.4 (TBS) containing 50 mM Tris-HCl and 200
mM NaCl, at room temperature for 1 h. The membranes
were subsequently incubated with human sera at
dilution 1:200 in 1% skim milk overnight at room
temperature, washed 3 times with a solution containing
TBS, 0.2% Triton X-100 and 0.05% skim milk and then 2
times with TBS. The membranes were then incubated for
2 h at room temperature with alkaline phosphatase
conjugated goat anti-mouse IgG + IgM + IgA (Sigma)
diluted at 1:5000 in 1% skim milk. After repeating
the washing, the membranes were stained with nitroblue
tetrazolium chloride and 5-bromo-4-chloro-3-
indolylphosphate p-toluidine salt (Gibco BRL).



Two bands of approximately 226 kDa and 234 kDa were
identified as specifically present on the Western
blot of BCBL cell lysate with 5 sera from gay male
AIDS patients with KS. These 2 bands were absent
from the P3H3, Raji and Bjab cell lysates.
5 sera from gay male AIDS patients without KS, 2
sera from female AIDS patients without KS, and 1
serum from a nasopharyngeal carcinoma patient were not
able to detect these 2 bands in BCBL-1, P3H3, Raji and
Bjab cell lysates. In a blinded experiment, using the
226 kDa and 234 kDa markers, 15 out of 16 sera from KS
patients were correctly identified. In total, the
226 kDa and 234 kDa markers were detected in 20 out of
21 sera from KS patients.
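The detection counts above can be expressed as proportions with an approximate confidence interval, which helps convey how much uncertainty remains with samples of this size. The following sketch uses the Wilson score interval; that choice, the 95% level and the function names are assumptions of this example and are not part of the reported analysis.

```python
import math


def detection_fraction(detected, total):
    """Fraction of sera in which the marker bands were detected."""
    return detected / total


def wilson_interval(detected, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95%)."""
    p = detected / total
    z2 = z * z
    denom = 1.0 + z2 / total
    center = (p + z2 / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total
                         + z2 / (4.0 * total * total)) / denom
    return center - half, center + half


if __name__ == "__main__":
    # Counts from the text: blinded experiment and overall total.
    for label, k, n in (("blinded", 15, 16), ("total", 20, 21)):
        low, high = wilson_interval(k, n)
        print(f"{label}: {detection_fraction(k, n):.3f} "
              f"(approx. 95% interval {low:.2f} to {high:.2f})")
```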

The antigen is enriched in the nuclei fraction of
BCBL1. Enriched antigen with low background can be
obtained by preparing nuclei from BCBL as the
starting antigen preparation using standard, widely
available protocols. For example, 500-750 ml of BCBL
at 5x10⁵ cells/ml can be pelleted at low speed. The
pellet is placed in 10 mM NaCl, 10 mM Tris pH 7.6, 1.5
mM MgCl₂ (equivolume) + 1.0% NP-40 on ice for 20 min
to lyse cells. The lysate is then spun at 1500 rpm
for 10 min to pellet nuclei. The pellet is used as
the starting fraction for the antigen preparation for
the Western blot. This will reduce cross-reactive
cytoplasmic antigens.

Experiment 8: Transmission studies

Co-infection experiments

BCBL1 cells were co-cultivated with Raji cell lines
separated by a 0.45 µ tissue filter insert.
Approximately 1-2 x 10⁶ BCBL1 and 2x10 Raji cells
were co-cultivated for 2-20 days in supplemented RPMI
alone, in 10 µg/ml 5'-bromodeoxyuridine (BUdR) and 0.6
µg/ml 5'-fluorodeoxyuridine or 20 ng/ml 12-O-
tetradecanoylphorbol-13-acetate (TPA). After 2, 8, 12
or 20 days co-cultivation, Raji cells were removed,
washed and placed in supplemented RPMI 1640 media. A
Raji culture co-cultivated with BCBL1 in 20 ng/ml TPA
for 2 days survived and has been kept in continuous
suspension culture for >10 weeks. This cell line,
designated RCC1 (Raji Co-Culture, No. 1), remains PCR
positive for the KS330₂₃₄ sequence after multiple
passages. This cell line is identical to its parental
Raji cell line by flow cytometry using EMA, B1, B4 and
BerH2 lymphocyte flow cytometry (approximately 2%).
RCC1 periodically undergoes rapid cytolysis suggestive
of lytic reproduction of the agent. Thus, RCC1 is a
Raji cell line newly infected with KSHV.

The results indicate the presence of a new human
virus, specifically a herpesvirus, in KS lesions. The
high degree of association between this agent and
AIDS-KS (>90%), and the low prevalence of the agent in
non-KS tissues from immunocompromised AIDS patients,
indicate that this agent has a causal role in AIDS-KS
[47, 68].

Experiment 10: Isolation of KSHV

Crude virus preparations are made from either the
supernatant or the low speed pelleted cell fraction of
BCBL1 cultures. Approximately 650 ml or more of log
phase cells should be used (>5X10< cells/ml).

For banding whole virion from supernatant, the cell
free supernatant is spun at 10,000 rpm in a GSA rotor
for 10 min to remove debris. PEG-8000 is added to 7%,
dissolved and placed on ice for >2.5 hours. The
PEG-supernatant is then spun at 10,000 xg for 30 min.
The supernatant is poured off and the pellet is dried
and scraped together from the centrifuge bottles. The
pellet is then resuspended in a small volume (1-2 ml)
of virus buffer (VB, 0.1 M NaCl, 0.01 M Tris, pH 7.5).
This procedure will precipitate both naked genome and
whole virion. The virion are then isolated by
centrifugation at 25,000 rpm in a 10-50% sucrose
gradient made with VB. One ml fractions of the
gradient are then obtained by standard techniques
(e.g. using a fractionator) and each fraction is then
tested by dot blotting using specific hybridizing
primer sequences to determine the gradient fraction
containing the purified virus (preparation of the
fraction may be needed in order to detect the presence
of the virus, such as standard DNA extraction).

To obtain the episomal DNA from the virus, the pellet
of cells is washed and pelleted, then lysed
using hypotonic shock and/or repeated cycles of
freezing and thawing in a small volume.
Nuclei and other cytoplasmic debris are removed by
centrifugation at 10,000g for 10 min, filtration
through a 0.45 µ filter and then repeat centrifugation
at 10,000g for 10 min. This crude preparation
contains viral genome and soluble cell components.
The genome preparation can then be gently
chloroform-phenol extracted to remove associated
proteins or can be placed in neutral DNA buffer (1 M
NaCl, 50 mM Tris, 10 mM EDTA, pH 7.2-7.6) with 2%
sodium dodecylsulfate (SDS) and 1% sarcosyl. The
genome is then banded by centrifugation through a
10-30% sucrose gradient in neutral DNA buffer
containing 0.15% sarcosyl at 20,000 rpm in a SW 27.1
rotor for 12 hours (or 40,000 rpm for 2-3 hours in an
SW41 rotor). The band is detected as described above.

An example of the method for isolating the KSHV genome
from KSHV-infected cell cultures follows (97 and 98).
Approximately 800 ml of BCBL-1 cells are pelleted,
washed with saline, and pelleted by low speed
centrifugation. The cell pellet is lysed with an
equal volume of RSB (10 mM NaCl, 10 mM Tris-HCl, 1.5
mM MgCl₂, pH 7.5) with 1% NP-40 on ice for 10 minutes.
The lysate is centrifuged at 900xg for 10 minutes to
pellet nuclei. This step is repeated. To the
supernatant is added 0.4% sodium dodecylsulfate and
EDTA to a final concentration of 10 mM. The
supernatant is loaded on a 10-30% sucrose gradient in
1.0 M NaCl, 1 mM EDTA, 50 mM Tris-HCl, pH 7.5. The
gradients are centrifuged at 20,000 rpm on a SW 27.1
rotor for 12 hours. In Figure 11, 0.5 ml aliquots of
the gradient have been fractionated (fractions 1-62)
with the 30% gradient fraction being at fraction No.
1 and the 10% gradient fraction being at fraction No.
62. Each fraction has been dot hybridized to a
nitrocellulose membrane and then a ³²P-labeled KSHV DNA
fragment, KS631Bam, has been hybridized to the membrane
using standard techniques. Figure 11 shows that the
major solubilized fraction of the KSHV genome bands
(i.e. is isolated) in fractions 42 through 46 of the
gradient with a high concentration of the genome being
present in fraction 44. A second band of solubilized
KSHV DNA occurs in fractions 26 through 32.
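To relate fraction numbers to position in the gradient, one can estimate the nominal sucrose concentration of each fraction. The sketch below assumes a strictly linear gradient running from 30% at fraction 1 to 10% at fraction 62; linearity is an assumption of this example (real gradients deviate from it), and `sucrose_percent` is a hypothetical helper name.

```python
def sucrose_percent(fraction, n_fractions=62, first=30.0, last=10.0):
    """Nominal sucrose concentration (%) of a gradient fraction.

    Assumes a linear gradient from `first` % at fraction 1 to `last` %
    at fraction `n_fractions`.
    """
    if not 1 <= fraction <= n_fractions:
        raise ValueError("fraction number out of range")
    step = (last - first) / (n_fractions - 1)
    return first + step * (fraction - 1)


if __name__ == "__main__":
    # Bands described in the text: fractions 42-46 (peak at 44) and 26-32.
    for label, fractions in (("major band", (42, 44, 46)),
                             ("second band", (26, 32))):
        values = ", ".join(f"{f}: {sucrose_percent(f):.1f}%"
                           for f in fractions)
        print(f"{label}: {values}")
```

Under this assumption the peak fraction (No. 44) sits at roughly 16% sucrose.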

Experiment 11: Purification of KSHV

DNA is extracted using standard techniques from the
RCC-1 or RCC-1 2FE cell line [27, 49, 66]. The DNA is
tested for the presence of KSHV by Southern
blotting and PCR using the specific probes as
described hereinafter. Fresh lymphoma tissue
containing viable infected cells is simultaneously
filtered to form a single cell suspension by standard
techniques [49, 66]. The cells are separated by
standard Ficoll-Paque centrifugation and the lymphocyte
layer is removed. The lymphocytes are then placed at
>1x10' : cells/ml into standard lymphocyte tissue culture
medium, such as RPMI 1640 supplemented with 10% fetal
calf serum. Immortalized lymphocytes containing the
KSHV virus are indefinitely grown in the culture media
while nonimmortalized cells die during the course of
prolonged cultivation.

Further, the virus may be propagated in a new cell
line by removing media supernatant containing the
virus from a continuously infected cell line at a
concentration of >lx--~ cells/ml. The media is
centrifuged at 2000xg for 10 minutes and filtered
through a 0.45 µ filter to remove cells. The media is
applied in a 1:1 volume with cells growing at >1x10*
cells/ml for 48 hours. The cells are washed and
pelleted and placed in fresh culture medium, and
tested after 14 days of growth.

The herpesvirus may be isolated from the cell DNA in
the following manner. An infected cell line, which
can be lysed using standard methods such as hyposmotic
shocking and Dounce homogenization, is first pelleted
at 2000xg for 10 minutes, the supernatant is removed
and centrifuged again at 10,000xg for 15 minutes to
remove nuclei and organelles. The supernatant is
filtered through a 0.45 µ filter and centrifuged again
at 100,000xg for 1 hour to pellet the virus. The
virus can then be washed and centrifuged again at
100,000xg for 1 hour.

EXPERIMENTAL DETAILS SECTION II:

Sequencing studies: A lambda phage (KS5) from a KS
lesion genomic library identified by positive
hybridization with KS330Bam was digested with BamHI
and Not I (Boehringer-Mannheim, Indianapolis IN); five
fragments were gel isolated and subcloned into
Bluescript II KS (Stratagene, La Jolla CA). The
entire sequence was determined by bidirectional
sequencing at a seven-fold average redundancy by
primer walking and nested deletions.

DNA sequence data were compiled and aligned using
ALIGN (IBI-Kodak, Rochester NY) and analyzed using the
Wisconsin Sequence Analysis Package Version 8-UNIX
(Genetics Computer Group, Madison WI) and the GRAIL
Sequence Analysis, Gene Assembly and Sequence
Comparison System v. 1.2 (Informatics Group, Oak Ridge
TN). Protein site motifs were identified using Motif
(Genetics Computer Group, Madison WI).

Sources of Herpesvirus Gene Sequence Comparisons:
Complete genomic sequences of three gammaherpesviruses
were available: Epstein-Barr virus (EBV), a
herpesvirus of humans [4]; herpesvirus saimiri (HVS),
a herpesvirus of the New World monkey Saimiri sciureus
[1]; and equine herpesvirus 2 (EHV2 [49]). Additional
thymidine kinase gene sequences were obtained for
alcelaphine herpesvirus 1 (AHV1 [22]) and bovine
herpesvirus 4 (BHV4 [31]). Sequences for the major
capsid protein genes of human herpesvirus 6B and human
herpesvirus 7 (HHV7) were from Mukai et al. [34]. The
sources of all other sequences used are listed
previously in McGeoch and Cook [31] and McGeoch et al.
[32].
Phylogenetic Inference: Predicted amino acid sequences
used for tree construction were based on previous
experience with herpesviral phylogenetic analyses
[31]. Alignments of homologous sets of amino acid
sequences were made with the AMPS [5] and Pileup [16]
programs. Regions of alignments that showed extreme
divergence with marked length heterogeneity, typically
terminal sections, were excised. Generally, positions
in alignments that contained inserted gaps in one or
more sequences were removed before use for tree
construction. Phylogenetic inference programs were
from the Phylip set, version 3.5c [14] and from the
GCG set [16]. Trees were built with the maximum
parsimony (MP) and neighbor joining (NJ) methods. For
the NJ method, which utilizes estimates of pairwise
distances between sequences, distances were estimated
as mean numbers of substitution events per site with
Protdist using the PAM 250 substitution probability
matrix of Schwartz & Dayhoff [46]. Bootstrap
analysis [15] was carried out for MP and NJ trees,
with 100 sub-replicates of each alignment, and
consensus trees were obtained with the program Consense.
In addition, the program Protml was used to infer trees
by the maximum likelihood (ML) method. Protml was
obtained from J. Adachi, Department of Statistical
Science, The Graduate University for Advanced Study,
Tokyo 106, Japan. Because of computational
constraints, Protml was used only with the 4-species
CS1 alignment.
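The removal of gapped alignment positions described above
can be illustrated with a short Python sketch. This is an
illustration only: the function name and the toy sequences
are hypothetical, and the sketch does not reproduce the
AMPS, Pileup or Phylip programs actually used.

```python
def strip_gapped_columns(aligned, gap_chars="-."):
    """Drop every alignment column that has a gap in any sequence.

    `aligned` maps a sequence name to its aligned string; all strings
    must share one length. The gap characters are an assumption.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    names = list(aligned)
    # zip(*rows) walks the alignment column by column.
    columns = zip(*(aligned[name] for name in names))
    kept = [col for col in columns if not any(c in gap_chars for c in col)]
    return {name: "".join(col[i] for col in kept)
            for i, name in enumerate(names)}


# Hypothetical toy alignment, not real herpesvirus sequence.
toy = {"KSHV": "MKT-LLA", "HVS": "MRTALLA", "EHV2": "MKSAL-A"}
stripped = strip_gapped_columns(toy)
print(stripped)
```

Applying the same function to each gene alignment before
concatenation would yield a gap-free combined set of the
kind described for CS1.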

Clamped Homogeneous Electric Field (CHEF) Gel
Electrophoresis: Agarose plugs were prepared by
resuspending BCBL-1 cells in 1% LMP agarose (Biorad,
Hercules CA) and 0.9% NaCl at 42°C to a final
concentration of 2.5 x 10^[illegible] cells/ml. Solidified
agarose plugs were transferred into lysis buffer (0.5M
EDTA pH 6.0, 1% sarcosyl, proteinase K at 1 mg/ml
final concentration) and incubated for 24 h.
Approximately 10^[illegible] BCBL-1 cells were loaded in
each lane. Gels were run at a gradient of 6.0 V/cm with a
run time of 28 h 28 min on a CHEF Mapper XA pulsed
field gel electrophoresis apparatus (Biorad, Hercules
CA), Southern blotted and hybridized to KS631Bam,
KS330Bam and an EBV terminal repeat sequence [40].

TPA Induction of Genome Replication: Late log phase
BCBL-1 cells (5x10^[illegible] cells per ml) were incubated
with varying amounts of 12-O-tetradecanoylphorbol-13-
acetate (TPA, Sigma Chemical Co., St. Louis MO) for 48
h; cells were then harvested and washed with
phosphate-buffered saline (PBS) and DNA was isolated
by chloroform-phenol extraction. DNA concentrations
were determined by UV absorbance; 5 μg of whole cell
DNA was quantitatively dot blot hybridized in
triplicate (Manifold I, Schleicher and Schuell, Keene
NH). KS631Bam, EBV terminal repeat and beta-actin
sequences were random-primer labeled with 32P [13].
Specific hybridization was quantitated on a Molecular
Dynamics PhosphorImager 425E.



Cell Cultures and Transmission Studies: Cells were
maintained at 5x10^5 cells per ml in RPMI 1640 with 20%
fetal calf serum (FCS, Gibco-BRL, Gaithersburg MD) and
periodically examined for continued KSHV infection by
PCR and dot hybridization. The T cell line Molt-3 (a
gift from Dr. Jodi Slack, Centers for Disease Control
and Prevention), Raji cells (American Type Culture
Collection, Rockville MD) and RCC-1 cells were
cultured in RPMI 1640 with 10% FCS. Owl monkey kidney
cells (American Type Culture Collection, Rockville MD)
were cultured in MEM with 10% FCS and 1% nonessential
amino acids (Gibco-BRL, Gaithersburg MD).
To produce the RCC-1 cell line, 2x10^[illegible] Raji cells
were cultivated with 1.4x10^[illegible] BCBL-1 cells in the
presence of 20 ng/ml TPA for 2 days in chambers separated
by Falcon 0.45 μ filter tissue culture inserts to
prevent contamination of Raji with BCBL-1.
Demonstration that RCC-1 was not contaminated with
BCBL-1 was obtained by PCR typing of HLA-DR alleles
[27] (Raji and RCC-1: DRβ1*0310, DRβ3*02; BCBL-1:
DRβ1*04, *07, DRβ4*01) and confirmed by flow cytometry
to determine the presence (Raji, RCC-1) or absence
(BCBL-1) of EMA membrane antigen. Clonal sublines of
RCC-1 were obtained by dilution in 36 well plates to
0.1 cells/well in RPMI 1640, 20% FCS and 30% T-STIM
culture supplement (Collaborative Biomedical Products,
Bedford MA). Subcultures were examined to ensure that
each was derived from a single cluster of growing
cells.
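The rationale for diluting to 0.1 cells/well can be
illustrated numerically. The sketch below assumes, as is
conventional for limiting dilution but is not stated in the
text above, that the number of cells per well follows a
Poisson distribution; the function name is hypothetical.

```python
import math


def prob_single_cell_given_growth(mean_cells_per_well):
    """P(exactly one cell | at least one cell) under a Poisson model.

    Assumption: cells are distributed among wells at random, so the
    count per well is Poisson with the given mean.
    """
    p_zero = math.exp(-mean_cells_per_well)
    p_one = mean_cells_per_well * p_zero
    return p_one / (1.0 - p_zero)


# Mean seeding density taken from the text (0.1 cells/well).
p_clonal = prob_single_cell_given_growth(0.1)
print(f"{p_clonal:.3f}")
```

Under this assumption roughly 95% of wells that receive any
cell receive exactly one, which is why visual confirmation
of a single growing cluster was still performed.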



In situ hybridization was performed with a previously
described 25 bp oligomer located in ORF26 which was 5'
labeled with fluorescein (Operon, Alameda CA) and
hybridized to cytospin preparations of BCBL-1, RCC-1
and Raji cells using the methods of Lungu et al. [29].
Slides were visualized both directly by UV microscopy
and after incubation with anti-fluorescein-
alkaline phosphatase (AP)-conjugated antibody
(Boehringer-Mannheim, Indianapolis IN), allowing
immunohistochemical detection of bound probe.
Positive control hybridization was performed using a
26 bp TET-labeled EBV DNA polymerase gene oligomer
(Applied Biosystems, Alameda CA) which was visualized
by UV microscopy only, and negative control
hybridization was performed using a 25 bp 5'
fluorescein-labeled HSV1 α47 gene oligomer (Operon,
Alameda CA) which was visualized in a similar manner
as the KSHV ORF26 probe. All nuclei of BCBL-1, RCC-1
and Raji appropriately stained with the EBV
hybridization probe whereas no specific staining of
the cells occurred after hybridization with the HSV1
probe.

The remaining suspension cell lines used in
transmission experiments were pelleted, and
resuspended in 5 ml of 0.22 or 0.45 μ filtered BCBL-1
tissue culture supernatant for 16 h. BCBL-1
supernatants were either from unstimulated cultures or
from cultures stimulated with 20 ng/ml TPA. No
difference in transmission to recipient cell lines was
noted using various filtration or stimulation
conditions. Fetal cord blood lymphocytes (FCBL) were
obtained from heparinized fresh post-partum umbilical
cord blood after separation on Ficoll-Paque (Pharmacia
LKB, Uppsala Sweden) gradients and cultured in RPMI
1640 with 10% fetal calf serum. Adherent recipient
cells were washed with sterile Hank's Buffered Salt
Solution (HBSS, Gibco-BRL, Gaithersburg MD) and
overlaid with 5 ml of BCBL-1 media supernatant. After
incubation with BCBL-1 media supernatant, cells were
washed three times with sterile HBSS, and suspended in
fresh media. Cells were subsequently rewashed three
times every other day for six days and grown for at
least two weeks prior to DNA extraction and testing.
PCR to detect KSHV infection was performed using
nested and unnested primers from ORF 26 and ORF 25 as
previously described [10, 35].

Indirect Immunofluorescence Assay: AIDS-KS sera were
obtained from ongoing cohort studies (provided by Drs.
Scott Holmberg, Thomas Spira and Harold Jaffe, Centers
for Disease Control and Prevention, and Isaac
Weisfuse, New York City Department of Health). Sera
from AIDS-KS patients were drawn between 1 and 21
months after initial KS diagnosis, sera from
intravenous drug user and homosexual/bisexual controls
were drawn after non-KS AIDS diagnosis, and sera from
HIV-infected hemophiliac controls were drawn at
various times after HIV infection. Immunofluorescence
assays were performed using an equal volume mixture of
goat anti-human IgG-FITC conjugate (Molecular Probes,
Eugene OR) and goat anti-human IgM-FITC conjugate
(Sigma Chemical Co., St. Louis MO) diluted 1:100 and
serial dilutions of patient sera. End-point titers
were read blindly and specific immunoglobulin binding
was assessed by the presence or absence of a specular
fluorescence pattern in the nuclei of the plated
cells. To adsorb cross-reacting antibodies, 20 μl of
serum diluted 1:10 in phosphate-buffered saline (PBS),
pH 7.4, was adsorbed with 1-3x10^7 paraformaldehyde-
fixed P3H3 cells for 4-10 h at 25°C; the cells were then
removed by low speed centrifugation. P3H3 were induced
prior to fixation with 20 ng/ml TPA for 48 h, fixed with
1% paraformaldehyde in PBS for 2 h at 4°C, and washed
three times in PBS prior to adsorption.



RESULTS 



Sequence Analysis of a 20.7 kb KSHV DNA Sequence:

To demonstrate that KS330Bam and KS631Bam are genomic
fragments from a new and previously uncharacterized
herpesvirus, a lambda phage clone (KS5) derived from
an AIDS-KS genomic DNA library was identified by
hybridization to the KS330Bam sequence. The KS5
insert was subcloned after NotI/BamHI digestion into
five subfragments and both strands of each fragment
were sequenced by primer walking or nested deletion
with a 7-fold average redundancy. The KS5 sequence is
20,705 bp in length and has a G+C content of 54.0%.
The observed/expected CpG dinucleotide ratio is 0.52
indicating no overall CpG suppression in this region.
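The two composition statistics reported above can be
computed as in the following Python sketch. The formula
for the expected CpG count (C count times G count divided
by sequence length) is a commonly used convention and is an
assumption here; the text does not state which formula was
applied. The function names and the toy sequence are
hypothetical.

```python
def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG dinucleotide ratio.

    Expected count is taken as (#C * #G) / length, an assumed
    convention; other definitions exist.
    """
    seq = seq.upper()
    observed = seq.count("CG")  # "CG" cannot overlap itself
    expected = seq.count("C") * seq.count("G") / len(seq)
    return observed / expected


# Hypothetical toy sequence, not KS5 data.
toy_seq = "ACGTCGGA"
toy_gc = gc_content(toy_seq)
toy_ratio = cpg_obs_exp(toy_seq)
print(toy_gc, toy_ratio)
```

For a full-length sequence such as KS5, the same functions
would be applied to the complete 20,705 bp string.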
Open reading frame (ORF) analysis identified 15
complete ORFs with coding regions ranging from 231 bp
to 4128 bp in length, and two incomplete ORFs at the
termini of the KS5 clone which were 135 and 552 bp in
length (Figure 12). The coding probability of each
ORF was analyzed using GRAIL 2 and CodonPreference
which identified 17 regions having excellent to good
protein coding probabilities. Each region is within
an ORF encoding a homolog to a known herpesvirus gene
with the exception of one ORF located at the genome
position corresponding to ORF28 in herpesvirus saimiri
(HVS). Codon preference values for all of the ORFs
were higher across predicted ORFs than in non-coding
regions when using a codon table composed of KS5
homologs to the conserved herpesvirus major capsid
(MCP), glycoprotein H (gH), thymidine kinase (TK), and
the putative DNA packaging protein (ORF29a/ORF29b)
genes.
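A simple six-frame ORF scan of the kind that precedes
coding-probability analysis can be sketched in Python as
follows. This is a minimal illustration assuming ORFs run
from an ATG to the first in-frame stop codon; it does not
reproduce GRAIL 2 or CodonPreference, and the function
names, default length cutoff and toy sequence are
hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_len=231):
    """Yield (strand, frame, start, end) for ATG..stop ORFs.

    Coordinates are 0-based on the scanned strand; `end` is
    exclusive and includes the stop codon. `min_len` is in bp.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens the ORF
                elif start is not None and codon in STOPS:
                    if i + 3 - start >= min_len:
                        yield (strand, frame, start, i + 3)
                    start = None


# Hypothetical toy sequence: one 15 bp ORF on the forward strand.
toy_dna = "CC" + "ATG" + "AAA" * 3 + "TAA" + "G"
orfs = list(find_orfs(toy_dna, min_len=12))
print(orfs)
```

Incomplete ORFs at the clone termini, such as the two
reported above, would not be found by this sketch because
it requires both a start and a stop codon.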

The translated sequence of each ORF was used to search
GenBank/EMBL databases with BLASTX and FastA
algorithms [2, 38]. All of the putative KS5 ORFs,
except one, have sequence and collinear positional
homology to ORFs from gamma-2 herpesviruses,
especially HVS and equine herpesvirus 2 (EHV2).
Because of the high degree of collinearity and amino
acid sequence similarity between KSHV and HVS, KSHV
ORFs have been named according to their HVS positional
homologs (i.e. KSHV ORF25 is named after HVS ORF25).



The KS5 sequence spans a region which includes three
of the seven conserved herpesvirus gene blocks (Figure
14) [10]. ORFs present in these blocks include genes
which encode herpesvirus virion structural proteins
and enzymes involved in DNA metabolism and
replication. Amino acid identities between KS5 ORFs
and HVS ORFs range from 30% to 60%, with the conserved
MCP ORF25 and ORF29b genes having the highest
percentage amino acid identity to homologs in other
gammaherpesviruses. KSHV ORF28, which has no
detectable sequence homology to HVS or EBV genes, has
positional homology to HVS ORF28 and EBV BDLF3. ORF28
lies at the junction of two gene blocks (Figure 14);
these junctions tend to exhibit greater sequence
divergence than intrablock regions among herpesviral
genomes [17]. Two ORFs were identified with sequence
homology to the putative spliced protein packaging
genes of HVS (ORF29a/ORF29b) and herpes simplex virus
type 1 (UL15). The KS330Bam sequence is located
within KSHV ORF26, whose HSV-1 counterpart, VP23, is
a minor virion structural component.

For every KSHV homolog, the HVS amino acid similarity
spans the entire gene product, with the exception of
ORF21, the TK gene. The KSHV TK homolog contains a
proline-rich domain at its amino terminus (nt 20343-
19636; aa 1-236) that is not conserved in other
herpesvirus TK sequences, while the carboxyl terminus
(nt 19637-18601; aa 237-565) is highly similar to the
corresponding regions of HVS, EHV2, and bovine
herpesvirus 4 (BHV4) TK. A purine binding motif with
a glycine-rich region found in herpesviral TK genes,
as well as other TK genes, is present in the KSHV TK
homolog (GVMGVGKS; aa 260-267).
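A pattern search for a glycine-rich nucleotide-binding
motif can be sketched in Python with a regular expression.
The pattern below is the PROSITE-style P-loop consensus
[AG]-x(4)-G-K-[ST]; that this is the exact pattern matched
in the analysis above is an assumption. The residues
flanking GVMGVGKS in the test peptide are hypothetical and
are not taken from the KSHV TK sequence.

```python
import re

# PROSITE-style P-loop consensus [AG]-x(4)-G-K-[ST] as a regex.
# Assumed for illustration; the text does not give the pattern.
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_motif(protein, pattern=P_LOOP):
    """Return (first_residue, last_residue, match) with 1-based positions."""
    return [(m.start() + 1, m.end(), m.group())
            for m in pattern.finditer(protein)]


# Hypothetical peptide containing the octapeptide reported in the text.
hits = find_motif("MSTPGVMGVGKSTLAR")
print(hits)
```

The octapeptide GVMGVGKS satisfies this consensus: G,
four arbitrary residues (VMGV), then G, K and S.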

The KS5 translated amino acid sequences were searched
against the PROSITE Dictionary of Protein Sites and
Patterns (Dr. Amos Bairoch, University of Geneva,
Switzerland) using the computer program Motifs. Four
sequence motif matches were identified among KSHV
hypothetical protein sequences. These matches
included: (i) a cytochrome c family heme-binding motif
in ORF33 (CVKCKQ; aa 209-214) and ORF34 (CLLCHI; aa
257-261), (ii) an immunoglobulin and major
histocompatibility complex protein signature in ORF25
(FICQAKK; aa 1024-1030), (iii) a mitochondrial energy
transfer protein motif in ORF26 (PDDITRMRV; aa 260-
268), and (iv) the purine nucleotide binding site
identified in ORF21. The purine binding motif is the
only motif with obvious functional significance. A
cytosine-specific methylase motif present in HVS ORF27
is not present in KSHV ORF27. This motif may play a
role in the methylation of episomal DNA in cells
persistently infected with HVS [1].



Phylogenetic analysis of KSHV: Amino acid sequences
translated from the KS5 sequence were aligned with
corresponding sequences from other herpesviruses. On
the basis of the level of conserved aligned residues
and the low incidence of introduced gaps, the amino
acid alignments for ORFs 21, 22, 23, 24, 25, 26, 29a,
29b, 31 and 34 were suitable for phylogenetic
analyses.



To demonstrate the phylogenetic relationship of KSHV
to other herpesviruses, a single-gene comparison was
made for ORF25 (MCP) homologs from KS5 and twelve
members of Herpesviridae (Figures 15A-15B). The
thirteen available MCP amino acid sequences are large
(1376 a.a. residues for the KSHV homolog) and
alignment required only a low level of gapping;
however, the overall similarity between viruses is
relatively low [33]. The MCP set gave stable trees
with high bootstrap scores and assigned the KSHV
homolog to the gamma-2 sublineage (genus
Rhadinovirus), containing HVS, EHV2 and BHV4 [20, 33,
43]. KSHV was most closely associated with HVS.
Similar results were obtained for single-gene
alignments of TK and UL15/ORF29 sets but with lower
bootstrap scores so that among gamma-2 herpesvirus
members branching orders for EHV2, HVS and KSHV were
not resolved.

To determine the relative divergence between KSHV and
other gammaherpesviruses, alignments for the nine
genes listed above were concatenated to produce a
combined gammaherpesvirus gene set (CS1) containing
EBV, EHV2, HVS and KSHV amino acid sequences. The
total length of CS1 was 4247 residues after removal of
positions containing gaps introduced by the alignment
process in one or more of the sequences. The CS1
alignment was analyzed by the ML method, giving the
tree shown in Figure 15B, and by the MP and NJ methods
used with the aligned herpesvirus MCP sequences. All
three methods identified KSHV and HVS as sister
groups, confirming that KSHV belongs in the gamma-2
sublineage with HVS as its closest known relative. It
was previously estimated that divergence of the HVS
and EHV2 lineages may have been contemporary with
divergence of the primate and ungulate host lineages
[33]. The results for the CS1 set suggest that HVS
and KSHV represent a lineage of primate herpesviruses
and, based on the distance between KSHV and HVS
relative to the position of EHV2, divergence between
HVS and KSHV lines is ancient.

Genomic Studies of KSHV: 

CHEF electrophoresis performed on BCBL-1 cells
embedded in agarose plugs demonstrated the presence of
a nonintegrated KSHV genome as well as a high
molecular weight species (Figures 16A-16B). KS631Bam
(Figure 16A) and KS330Bam specifically hybridized to
a single CHEF gel band comigrating with 270 kilobase
(kb) linear DNA standards. The majority of
hybridizing DNA was present in a diffuse band at the
well origin; a low intensity high molecular weight
(HMW) band was also present immediately below the
origin (Figure 16A, arrow). The same filter was
stripped and probed with an EBV terminal repeat
sequence [40] yielding a 150-160 kb band (Figure 16B)
corresponding to linear EBV DNA [24]. The HMW EBV
band may correspond to either circular or concatemeric
EBV DNA [24].



The phorbol ester TPA induces replication-competent
EBV to enter a lytic replication cycle [49]. To
determine if TPA induces replication of KSHV and EBV
in BCBL-1 cells, these cells were incubated with
varying concentrations of TPA for 48 h (Figure 17).
Maximum stimulation of EBV occurred at 20 ng/ml TPA
which resulted in an eight-fold increase in
hybridizing EBV genome. Only a 1.3-1.4 fold increase
in KSHV genome abundance occurred after 20-80 ng/ml
TPA incubation for 48 h.

Transmission Studies: 

Prior to determining that the agent was likely to be
a member of Herpesviridae by sequence analysis, BCBL-1
cells were cultured with Raji cells, a nonlytic EBV
transformed B cell line, in chambers separated by a
0.45 μ tissue culture filter. Recipient Raji cells
generally demonstrated rapid cytolysis suggesting
transmission of a cytotoxic component from the BCBL-1
cell line. One Raji line cultured in 10 ng/ml TPA for
2 days underwent an initial period of cytolysis
before recovery and resumption of logarithmic growth.
This cell line (RCC-1) is a monoculture derived from
Raji uncontaminated by BCBL-1 as determined by PCR
amplification of HLA-DR sequences. RCC-1 has remained
positive for the KS330 233 PCR product for >6 months in
continuous culture (approximately 70 passages), but
KSHV was not detectable by dot hybridization. In situ
hybridization, however, with a 25 bp KSHV ORF26-
derived oligomer was used to demonstrate persistent
localization of KSHV DNA to RCC-1 nuclei. As
indicated in Figures 18A-18C, nuclei of BCBL-1 and
RCC-1 (from passage ~65) cells had detectable
hybridization with the ORF26 oligomer, whereas no
specific hybridization occurred with parental Raji
cells (Figure 18B). KSHV sequences were detectable in
65% of BCBL-1 and 2.6% of RCC-1 cells under these
conditions. In addition, forty-five monoclonal
cultures were subcultured by serial dilution from RCC-
1 at passage 50, of which eight (18%) clones were PCR
positive by KS330 233. While PCR detection using
unnested KS330 233 primer pairs was lost by passage 15
in each of the clonal cultures, persistent KSHV genome
was detected in 5 clones using two more sensitive
nonoverlapping nested PCR primer sets [33], suggesting
that KSHV genome is lost over time in RCC-1 and its
clones.

Low but persistent levels of KS330 233 PCR positivity
were found for one of four Raji, one of four Bjab, two
of three Molt-3, one of one owl monkey kidney cell
lines and three of eight human fetal cord blood
lymphocyte (FCBL) cultures after inoculation with 0.2-
0.45 μ filtered BCBL-1 supernatants. Among the PCR
positive cultures, PCR detectable genome was lost
after 2-6 weeks and multiple washings. Five FCBL
cultures developed cell clusters characteristic of EBV
immortalized lymphocytes and were positive for EBV by
PCR using EBER primers [23]; three of these cultures
were also initially KS330 233 positive. None of the
recipient cell lines had detectable KSHV genome by dot
blot hybridization.






Indirect immunofluorescence antibody assays (IFA) were
used to assess the presence of specific antibodies
against the KSHV- and EBV-infected cell line BHL-6 in
the sera from AIDS-KS patients and control patients
with HIV infection or AIDS. BHL-6 was substituted for
BCBL-1 for reasons of convenience; preliminary studies
showed no significant differences in IFA results
between BHL-6 and BCBL-1. BHL-6 cells show diffuse
immunofluorescent cell staining with most KS patient
and control unadsorbed sera, suggesting nonspecific
antibody binding (Figures 19A-19D). After adsorption
with paraformaldehyde-fixed, TPA-induced P3H3 (an EBV
producer subline of P3J-HR1, a gift of Dr. George
Miller) to remove cross-reacting antibodies against
EBV and lymphocyte antigens, patient sera generally
showed specular nuclear staining at high titers while
this staining pattern was absent from control patient
sera (Figures 19B and 19D). Staining was localized
primarily to the nucleus but weak cytoplasmic staining
was also present at low sera dilutions.



With unadsorbed sera, the initial endpoint geometric
mean titers (GMT) against BHL-6 cell antigens for the
sera from AIDS-KS patients (GMT=1:1153; range: 1:150
to 1:12,150) were higher than for sera from control,
non-KS patients (GMT=1:342; range 1:50 to 1:12,150;
p=0.04) (Figure 13). While AIDS-KS patients and HIV-
infected gay/bisexual and intravenous drug user
control patients had similar endpoint titers to BHL-6
antigens (GMT=1:1265 and GMT=1:1575, respectively),
hemophilic AIDS patient titers were lower
(GMT=1:[illegible]). Both case and control patient
groups had elevated IFA titers against the EBV infected
cell line P3H3.



The difference in endpoint GMT between case and
control titers against BHL-6 antigens increased after
adsorption with P3H3. After adsorption, case GMT
declined to 1:780 and control GMT declined to 1:61
(p=0.00009). Similar results were obtained by using
BCBL-1 instead of BHL-6 cells, by pre-adsorbing with
EBV-infected nonproducer Raji cells instead of P3H3
and by using sera from a homosexual male KS patient
without HIV infection, in complete remission for KS
for 9 months (BHL-6 titer 1:450, P3H3 titer 1:150).
Paired sera taken 8-14 months prior to KS onset and
after KS onset were available for three KS patients:
KS patients 8 and 13 had eight-fold rises and patient
8 had a three-fold fall in P3H3-adsorbed BCBL-1 titers
from pre-onset sera to post-KS sera.
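The geometric mean titers reported above are means taken
on the logarithmic scale of the reciprocal endpoint
dilutions. A minimal Python sketch follows; the function
name and the example titers are hypothetical and are not
patient data from this study.

```python
import math


def geometric_mean_titer(reciprocal_titers):
    """Geometric mean of reciprocal endpoint titers (e.g. 450 for 1:450)."""
    logs = [math.log(t) for t in reciprocal_titers]
    return math.exp(sum(logs) / len(logs))


# Hypothetical endpoint titers from a three-fold dilution series.
example_titers = [150, 450, 1350, 1350, 4050]
gmt = geometric_mean_titer(example_titers)
print(f"GMT = 1:{gmt:.0f}")
```

Because titers from serial dilutions are spaced
multiplicatively, the geometric mean is less affected by a
single very high titer than the arithmetic mean would be.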

DISCUSSION 

These studies demonstrate that specific DNA sequences
found in KS lesions by representational difference
analysis belong to a newly identified human
herpesvirus. The current studies define this agent as
a human gamma-2 herpesvirus that can be continuously
cultured in naturally-transformed, EBV-coinfected
lymphocytes from AIDS-related body-cavity based
lymphomas.

Sequence analysis of the KS5 lambda phage insert
provides clear evidence that the KS330Bam sequence is
part of a larger herpesvirus genome. KS5 has a 54.0%
G+C content which is considerably higher than the
corresponding HVS region (34.3% G+C). While there is
no CpG dinucleotide suppression in the KS5 sequence,
the corresponding HVS region has a 0.33
observed:expected CpG dinucleotide ratio [1]. The CpG
dinucleotide frequency in herpesviruses varies from
global CpG suppression among gammaherpesviruses to
local CpG suppression in the betaherpesviruses, which
may result from deamination of 5-methylcytosine
residues at CpG sites resulting in TpG substitutions
[21]. CpG suppression among herpesviruses [21, 30,
44] has been hypothesized to reflect co-replication of
latent genome in actively dividing host cells, but it
is unknown whether or not KSHV is primarily maintained
by a lytic replication cycle in vivo.

The 20,705 bp KS5 fragment has 17 protein-coding
regions, 15 of which are complete ORFs with
appropriately located TATA and polyadenylation
signals, and two incomplete ORFs located at the phage
insert termini. Sixteen of these ORFs correspond by
sequence and collinear positional homology to 15
previously identified herpesviral genes including the
highly conserved spliced gene. The conserved
positional and sequence homology for KSHV genes in
this region is consistent with the possibility that
the biological behavior of the virus is similar to
that of other gammaherpesviruses. For example,
identification of a thymidine kinase-like gene on KS5
implies that the agent is potentially susceptible to
TK-activated DNA polymerase inhibitors and like other
herpesviruses possesses viral genes involved in
nucleotide metabolism and DNA replication [41]. The
presence of major capsid protein and glycoprotein H
gene homologs suggests that replication competent virus
would produce a capsid structure similar to other
herpesviruses.

Phylogenetic analyses of molecular sequences show that
KSHV belongs to the gamma-2 sublineage of the
Gammaherpesvirinae subfamily, and is thus the first
human gamma-2 herpesvirus identified. Its closest
known relative based on available sequence comparisons
is HVS, a squirrel monkey gamma-2 herpesvirus that
causes fulminant polyclonal T cell lymphoproliferative
disorders in some New World monkey species. Data for
the gamma-2 sublineage are sparse: only three viruses
(KSHV, HVS and EHV2) can at present be placed on the
phylogenetic tree with precision (the sublineage also
contains murine herpesvirus 68 and BHV4 [33]). Given
the limitation in resolution imposed by this thin
background, KSHV and HVS appear to represent a lineage
of primate gamma-2 viruses. Previously, McGeoch et
al. [33] proposed that lines of gamma-2 herpesviruses
may have originated by cospeciation with the ancestors
of their host species. Extrapolation of this view to
KSHV and HVS suggests that these viruses diverged at
an ancient time, possibly contemporaneously with the
divergence of the Old World and New World primate host
lineages. Gammaherpesviruses are distinguished as a
subfamily by their lymphotropism [41] and this
grouping is supported by phylogenetic analysis based
on sequence data [33]. The biologic behavior of KSHV
is consistent with its phylogenetic designation in
that KSHV can be found in in vitro lymphocyte cultures
and in in vivo samples of lymphocytes [3].

This band appears to be a linear form of the genome
because other "high molecular weight" bands are
present for both EBV and KSHV in BCBL-1 which may
represent circular forms of their genomes. The linear
form of the EBV genome, associated with replicating
and packaged DNA [41], migrates substantially faster
than the closed circular form associated with latent
viral replication [24]. While the 270 kb band appears
to be a linear form, it is also consistent with a
replicating dimer plasmid since the genome size of HVS
is approximately 135 kb. The true size of the genome
may only be resolved by ongoing mapping and sequencing
studies.

Replication deficient EBV mutants are common among EBV
strains passaged through prolonged tissue culture
[23]. The EBV strain infecting Raji, for example, is
a BALF-2 deficient mutant [15]; virus replication is
not inducible with TPA and its genome is maintained
only as a latent circular form [23, 33]. The EBV
strain coinfecting BCBL-1 does not appear to be
replication deficient because TPA induces eight-fold
increases in its DNA content and its genome has an
apparent linear form on CHEF electrophoresis. KSHV
replication, however, is only marginally induced by
comparable TPA treatment, indicating either
insensitivity to TPA induction or that the genome has
undergone loss of genetic elements required for TPA
induction. Additional experiments, however, indicate
that KSHV DNA can be pelleted by high speed
centrifugation of filtered organelle-free, DNase
I-protected BCBL-1 cell extracts, which is consistent
with KSHV encapsidation.

Transmission of KSHV DNA from BCBL-1 to a variety of
recipient cell lines is possible and KSHV DNA can be
maintained at low levels in recipient cells for up to
70 passages. However, detection of virus genome in
recipient cell lines by PCR may be due to physical
association of KSHV DNA fragments rather than true
infection. This appears to be unlikely given evidence
for specific nuclear localization of the ORF26
sequence in RCC-1. If transmission of infectious
virus from BCBL-1 occurs, it is apparent that the
viral genome declines in abundance with subsequent
passages of recipient cells. This is consistent with
studies of spindle cell lines derived from KS lesions.
Spindle cell cultures generally have PCR detectable
KSHV genome when first explanted, but rapidly lose
viral genome after initial passages and established
spindle cell cultures generally do not have detectable
KSHV sequences [3].

Infections with the human herpesviruses are generally
ubiquitous in that nearly all humans are infected by
early adulthood with six of the seven previously
identified human herpesviruses [42]. Universal
infection with EBV, for example, is the primary reason
for the difficulty in clearly establishing a causal
role for this virus in EBV-associated human tumors.
The serologic studies identified nuclear antigen in
BCBL-1 and BHL-6 which is recognized by sera from
AIDS-KS patients but generally not by sera from
control AIDS patients without KS after removal of EBV-
reactive antibodies. These data are consistent with
PCR studies of KS and control patient lymphocytes
suggesting that KSHV is not ubiquitous among adult
humans, but is specifically associated with persons
who develop Kaposi's sarcoma. In this respect, it
appears to be epidemiologically similar to HSV2 rather
than the other known human herpesviruses. An
alternative possibility is that elevated IFA titers
against BCBL-1 reflect disease status rather than
infection with the virus.
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EXPERIMENTAL DETAILS SECTION III: 

KS Patient Enrollment: Cases and controls were
selected from ongoing cohort studies based on the
availability of clinical information and appropriate
PBMC samples. Twenty-one homosexual or bisexual men
with AIDS who developed KS during their participation in
prospective cohort studies were identified [14-16].
Fourteen of these patients had paired PBMC samples
collected after KS diagnosis (median +4 months) and at
least four months prior to KS diagnosis (median -13
months), while the remaining 7 had paired PBMC taken
at the study visit immediately prior to KS diagnosis
(median -3 months) and at entry into their cohort
study (median -51 months prior to KS diagnosis).

Hemophilic and Homosexual/Bisexual Male AIDS Patient
Control Enrollment: Two control groups of AIDS
patients were examined: 23 homosexual/bisexual men
with AIDS followed until death who did not develop KS
("high risk" control group from the Multicenter AIDS
Cohort Study [16]), and 19 hemophilic men ("low risk"
control group) enrolled from joint projects of the
National Hemophilia Foundation and the Centers for
Disease Control and Prevention. Of the 16 hemophilic
controls with available follow-up information, none
is known to have developed KS, and <2% of hemophilic
AIDS patients historically develop KS [2]. For
homosexual/bisexual AIDS control patients who did not
develop KS, paired PBMC specimens were available at
entry into their cohort study (median -35 months prior
to AIDS onset) and at the study visit immediately
prior to nonKS AIDS diagnosis (median -6 months
prior to AIDS onset).



DNA Extraction and Analyses: DNA from 10⁶-10⁷ PBMC in
each specimen was extracted and quantitated by
spectrophotometry. Samples were prepared in
laboratories physically isolated from the laboratory
where polymerase chain reaction (PCR) analyses were
performed. All samples were tested for amplifiability
using primers specific for either the HLA-DQ locus
(GH26/GH27) or β-globin [18]. PCR detection of KSHV
DNA was performed as previously described [7] with the
following nested primer sets:
No. 1 outer 5'-AGCACTCGCAGGGCAGTACG-3', 5'-GACTCTTCGCTGATGAACTGG-3';
No. 1 inner 5'-TCCGTGTTGTCTACGTCCAG-3', 5'-AGCCGAAAGGATTCCACCAT-3';
No. 2 outer 5'-AGGCAACGTCAGATGTGAC-3', 5'-GAAATTACCCACGAGATCGC-3';
No. 2 inner 5'-CATGGGAGTACATTGTCAGGACCTC-3', 5'-GGAATTATCTCGCAGGTTGCC-3';
No. 3 outer 5'-GGCGACATTCATCAACCTCAGGG-3', 5'-ATATCATCCTGTGCGTTCACGAC-3';
No. 3 inner 5'-CATGGGAGTACATTGTCAGGACCTC-3', 5'-GGAATTATCTCGCAGGTTGCC-3'.
The outer primer set was
amplified for 35 cycles at 94° C for 30 seconds, 60° C
for 1 minute and 72° C for 1 minute with a 5 minute
final extension cycle at 72° C. One to three μl of
the PCR product was added to the inner PCR reaction
mixture and amplified for 25 additional cycles with a
5 minute final extension cycle. Primary determination
of sample positivity was made with primer set No. 1
and confirmed with either primer set No. 2 or No. 3, which
amplify nonoverlapping regions of the KSHV
hypothetical major capsid gene. Sampling two portions
of the KSHV genome decreased the likelihood of
intraexperimental PCR contamination. These nested
primer sets are 2-3 logs more sensitive for detecting
KSHV sequences than the previously published KS330₂₃₃
primers [6] and are estimated to be able to detect <10
copies of KSHV genome under optimal conditions.
Sample preparations were prealiquoted and amplified
with alternating negative control samples without DNA
to monitor and control possible contamination. All
samples were tested in a blinded fashion and a
determination of positivity or negativity was made before
code breaking. Significance testing was performed
with Mantel-Haenszel chi-squared estimates and exact
confidence intervals using Epi-Info ver. 6 (USD Inc.,
Stone Mt., GA).
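
The nested primer sets and cycle counts given above can be collected into a small data structure for bookkeeping. The following Python sketch is illustrative only: the names PRIMER_SETS, OUTER_CYCLES, INNER_CYCLES, gc_fraction and describe are hypothetical, the primers are stored in the order in which they are printed above, and the length and GC fraction are simple descriptive quantities that are not part of the described method.

```python
# Nested primer sets as listed in the text, keyed by (set, round).
# Each value holds the two primers in the order they are printed.
PRIMER_SETS = {
    ("No. 1", "outer"): ("AGCACTCGCAGGGCAGTACG", "GACTCTTCGCTGATGAACTGG"),
    ("No. 1", "inner"): ("TCCGTGTTGTCTACGTCCAG", "AGCCGAAAGGATTCCACCAT"),
    ("No. 2", "outer"): ("AGGCAACGTCAGATGTGAC", "GAAATTACCCACGAGATCGC"),
    ("No. 2", "inner"): ("CATGGGAGTACATTGTCAGGACCTC", "GGAATTATCTCGCAGGTTGCC"),
    ("No. 3", "outer"): ("GGCGACATTCATCAACCTCAGGG", "ATATCATCCTGTGCGTTCACGAC"),
    ("No. 3", "inner"): ("CATGGGAGTACATTGTCAGGACCTC", "GGAATTATCTCGCAGGTTGCC"),
}

OUTER_CYCLES = 35  # first-round amplification with the outer primers
INNER_CYCLES = 25  # additional cycles with the inner primers


def gc_fraction(primer):
    """Return the fraction of G and C bases in a primer."""
    primer = primer.upper()
    return sum(base in "GC" for base in primer) / len(primer)


def describe(primer_sets):
    """Yield (set, round, primer, length, GC fraction) for every primer."""
    for (set_name, pcr_round), pair in primer_sets.items():
        for primer in pair:
            yield set_name, pcr_round, primer, len(primer), gc_fraction(primer)


for row in describe(PRIMER_SETS):
    print(row)
```

Keeping the primers in one table makes it straightforward to check that every sequence contains only A, C, G and T before it is used elsewhere.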



RESULTS 






KSHV Positivity of Case and Control PBMC Samples:
Paired PBMC samples were available from each KS
patient and homosexual/bisexual control patient; a
single sample was available from each hemophilic
control patient.

To determine the KSHV positivity rate for each group
of AIDS patients, a single specimen from each
participant taken closest to KS or other AIDS-defining
illness ("second sample") was analyzed. Overall, 12
of 21 (57%) PBMC specimens from KS patients taken
from 6 months prior to KS diagnosis to 20 months after
KS diagnosis were KSHV positive. There was no
apparent difference in positivity rate between
immediate pre-diagnosis and post-diagnosis visit
specimens (4 of 7 (57%) vs. 8 of 14 (57%)
respectively).

The number of KSHV positive control PBMC specimens
from both homosexual/bisexual (second visit) and
hemophilic patient controls was significantly lower.
Only 2 of 19 (11%) hemophilic PBMC samples were
positive (odds ratio 11.3, 95% confidence interval
1.8 to 118) and only 2 of 23 (9%) PBMC samples from
homosexual/bisexual men who did not develop KS were
positive (odds ratio 14.0, 95% confidence interval 2.3
to 144). If all KS patient PBMC samples taken
immediately prior to or after diagnosis were truly
infected, the PCR assay was at least 57% sensitive in
detecting KSHV infection among PBMC samples. No
significant differences in CD4+ counts were found for
KS patients and homosexual/bisexual patients without
KS at the second sample evaluation (Kruskal-Wallis
p=0.15) (Figure 21). CD4+ counts from the single
sample from hemophilic AIDS patients were higher than
CD4+ counts from KS patients (Kruskal-Wallis
p=0.004), although both groups showed evidence of
HIV-related immunosuppression.
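
The odds ratios reported above follow from the stated counts by the cross-product formula for a 2x2 table. The Python sketch below is illustrative only; the function name odds_ratio is hypothetical, and the sketch does not attempt the exact confidence intervals, which were obtained with Epi-Info.

```python
def odds_ratio(case_pos, case_neg, ctrl_pos, ctrl_neg):
    """Cross-product odds ratio for a 2x2 case-control table."""
    return (case_pos * ctrl_neg) / (case_neg * ctrl_pos)


# KS patients: 12 of 21 positive, so 9 negative.
# Hemophilic controls: 2 of 19 positive, so 17 negative.
or_hemophilic = odds_ratio(12, 9, 2, 17)

# Homosexual/bisexual controls: 2 of 23 positive, so 21 negative.
or_homosexual = odds_ratio(12, 9, 2, 21)

print(round(or_hemophilic, 1), round(or_homosexual, 1))  # 11.3 14.0
```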



Longitudinal Studies:

Paired specimens were available from all 21 KS
patients and 23 homosexual/bisexual male AIDS control
patients who did not develop KS. For the KS group,
initial PBMC samples were taken four to 87 months
(median 13 months) prior to the onset of KS. Initial
PBMC samples from the control group were drawn 13 to
106 months (median 55 months) prior to onset of first
nonKS AIDS-defining illness (1987 CDC surveillance
definition). Eleven of 21 (52%) KS patients had
detectable KSHV DNA in PBMC samples taken prior to KS
onset compared to 2 of 19 (11%, p=0.005) hemophilic
control samples, and 1 (4%, p=0.0004) and 2 (9%,
p=0.002) of 23 homosexual/bisexual control samples
taken at the first and second visits respectively
(Figures 20A-20B). The figure shows that 7 of the
paired KS patient samples were positive at both
visits, 5 KS patients and 2 control patients converted
from negative to positive and two KS patients and one
control patient reverted from positive to negative
between visits. The remaining 7 KS patients and 20
control patients were negative at both visits.

For the 5 KS patients that converted from an initial
negative PBMC result to a positive result at or near
KS diagnosis, the median length of time between the
first sample and the KS diagnosis was 19 months.
Three of the 6 KS patients that were negative at both
visits had their last PBMC sample drawn 2-3 months
prior to onset of illness. It is unknown whether
these patients became infected between their last
study visit and the KS diagnosis date.
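
The p-values quoted for the pre-onset comparisons can be approximated from the stated counts with a Mantel-Haenszel chi-squared statistic for a single 2x2 table. The Python sketch below is a minimal illustration under that assumption; the function names are hypothetical, and small differences from the Epi-Info output are possible.

```python
import math


def mantel_haenszel_chi2(a, b, c, d):
    """Mantel-Haenszel chi-squared for one 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = (n - 1) * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_1df(chi2):
    """Upper-tail probability of a chi-squared variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Rows are (positive, negative); KS patients prior to onset: 11 of 21 positive.
comparisons = {
    "hemophilic controls (2 of 19)": (11, 10, 2, 17),
    "homosexual/bisexual, first visit (1 of 23)": (11, 10, 1, 22),
    "homosexual/bisexual, second visit (2 of 23)": (11, 10, 2, 21),
}

for label, table in comparisons.items():
    p = p_value_1df(mantel_haenszel_chi2(*table))
    print(f"{label}: p = {p:.4f}")
```

With these counts the three p-values come out close to the reported 0.005, 0.0004 and 0.002.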


DISCUSSION 

Ambroziak and coworkers have found evidence that KSHV
preferentially infects CD19+ B cells by PBMC subset
examination of three patients [19]. Other
gammaherpesviruses, such as Epstein-Barr virus (EBV)
and herpesvirus saimiri, are also lymphotropic
herpesviruses and can cause lymphoproliferative
disorders in primates [11, 20].

It is possible that KSHV, like most human
herpesviruses, is a ubiquitous infection of adults
[21]. EBV, for example, is detectable by PCR in CD19+
B lymphocytes from virtually all seropositive persons
[22] and approximately 98% of MACS study participants had
EBV VCA antibodies at entry into the cohort study
[23]. The findings, however, are most consistent with
control patients having lower KSHV infection rates
than cases and with KSHV being specifically associated
with the subsequent development of KS. While it is
possible that control patients are infected but have
an undetectably low KSHV viral PBMC load, the
inability to find evidence of infection in control
patients under a variety of PCR conditions suggests
that the majority of control patients are not
infected. Nonetheless, approximately 10% of these
patients were KSHV infected and did not develop KS.
It is unknown whether or not this is similar to the
KSHV infection rate for the general human population.

This study demonstrates that KSHV infection is both
strongly associated with KS and precedes onset of
disease in the majority of patients. 57% of KS
patients had detectable KSHV infection at their second
follow-up visit (52% prior to the onset of KS)
compared to only 9% of homosexual/bisexual (p=0.002)
and 11% of hemophilic control patients (p=0.005).
Despite similar CD4+ levels between
homosexual/bisexual KS cases and controls, KSHV DNA
positivity rates were significantly higher for cases
at both the first (p=0.005) and second sample visits,
indicating that immunosuppression alone was not
responsible for these elevated detection rates. It is
also unlikely that KSHV simply colonizes existing KS
lesions in AIDS patients since neither patient group
had KS at the time the initial sample was obtained.
Five KS patients and two homosexual/bisexual control
patients converted from negative to positive,
possibly due to new infection acquired during the
study period.

The findings are in contrast to PCR detection of KSHV
DNA in all 10 PBMC samples from KS patients by
Ambroziak et al. [19]. It is possible that the assay
was not sensitive enough to detect virus in all
samples, since it was required that each positive
sample be repeatedly positive with two independent
primer sets in blinded PCR assays. This appears unlikely,
however, given the sensitivity of the nested PCR
primer sets. The 7 KS patients who were persistently
negative on both paired samples may represent an
aviremic or low viral load subpopulation of KS
patients. The PCR conditions test a DNA amount
equivalent to approximately 2×10³ lymphocytes; an
average viral load less than 1 copy per 2×10³ cells may
be negative in the assay. Two KS patients and a
homosexual/bisexual control patient initially positive
for KSHV PCR amplification reverted to negative in
samples drawn after diagnosis. These results probably
reflect inability to detect KSHV DNA in peripheral
blood rather than true loss of infection, although more
detailed studies of the natural history of infection
are needed.
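
The sampling limit described above can be made concrete with a simple model. The Python sketch below assumes, purely for illustration, that viral genome copies are distributed among sampled cells according to a Poisson distribution and that a single copy in the reaction is always amplified; neither assumption is stated in the study, and the function name detection_probability is hypothetical.

```python
import math


def detection_probability(copies_per_cell, cells_tested=2000):
    """Probability that at least one genome copy is present in the sample.

    Assumes Poisson sampling with mean copies_per_cell * cells_tested.
    """
    expected_copies = copies_per_cell * cells_tested
    return 1.0 - math.exp(-expected_copies)


# Average loads of 1 copy per 200, 2,000 and 20,000 cells,
# each tested with DNA equivalent to 2,000 lymphocytes.
for cells_per_copy in (200, 2000, 20000):
    p = detection_probability(1.0 / cells_per_copy)
    print(f"1 copy per {cells_per_copy} cells: P(at least one copy) = {p:.3f}")
```

Under this model a sample carrying an average of one copy per 2×10³ cells contains no template at all in roughly a third of reactions, which illustrates how infected patients with low viral loads could test negative.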


The study was designed to answer the fundamental
question of whether or not infection with KSHV
precedes development of the KS phenotype. The
findings indicate that there is a strong antecedent
association between KSHV infection and KS. This
temporal relationship is an absolute requirement for
establishing that KSHV is central to the causal
pathway for developing KS. This study contributes
additional evidence for a possible causal role for
this virus in the development of KS.

EXPERIMENTAL DETAILS SECTION IV:

To determine if the KHV-KS virus is also present in
both endemic and HIV-associated KS lesions from
African patients, formalin-fixed, paraffin-embedded
tissues from both HIV seropositive and HIV
seronegative Ugandan KS patients were compared to
cancer tissues from patients without KS in a blinded
case-control study.



Patient Enrollment: Archival KS biopsy specimens were
selected from approximately equal numbers of
HIV-associated and endemic HIV-negative KS patients
enrolled in an ongoing case-control study of cancer
and HIV infection at Makerere University, Kampala,
Uganda. Control tissues were consecutive archival
biopsies from patients with various malignancies
enrolled in the same study, chosen without prior
knowledge of HIV serostatus. All patients were tested
for HIV antibody (measured by Cambridge Bioscience
Recombigen ELISA assay).

Tissue preparation: Each sample examined was from an
individual patient. Approximately ten tissue sections
were cut (10 micron) from each paraffin block using a
cleaned knife blade for each specimen. Tissue
sections were deparaffinized by extracting the
sections twice with 1 ml xylene for 15 min. followed
by two extractions with 100% ethanol for 15 min. The
remaining pellet was then resuspended and incubated
overnight at 50° C in 0.5 ml of lysis buffer (25 mM
KCl, 10 mM Tris-HCl, pH 8.3, 1.4 mM MgCl2, 0.01%
gelatin, 1 mg/ml proteinase K). DNA was extracted
with phenol/chloroform, ethanol precipitated and
resuspended in 10 mM Tris-HCl, 0.1 mM EDTA, pH 8.3.

PCR Amplification: 0.2-0.4 μg of DNA was used in PCR
reactions with KS330₂₃₃ primers as previously described
[7]. The samples which were negative were retested by
nested PCR amplification, which is approximately 10²-10³
fold more sensitive in detecting KS330₂₃₃ sequence
than the previously published KS330₂₃₃ primer set [7].
These samples were tested twice and samples showing
discordant results were retested a third time. Fifty-one of
74 samples initially examined were available for
independent extraction and testing at Chester Beatty
Laboratories, London, using identical nested PCR
primers and conditions to ensure fidelity of the PCR
results. Results from eight samples were discordant
between laboratories and were removed from the
analysis as uninterpretable (four positive samples
from each laboratory). Statistical comparisons were
made using EPI-INFO ver. 5 (USD, Stone Mt., GA, USA)
with exact confidence intervals.

RESULTS:

Of 66 tissues examined, 24 were from AIDS-KS cases, 20
were from endemic HIV seronegative KS cases, and 22
were from cancer control patients without KS. Seven
of the cancer control patients were HIV seropositive
and 15 were HIV seronegative (Figure 22). Tumors
examined in the control group included carcinomas of
the breast, ovaries, rectum, stomach, and colon,
fibrosarcoma, lymphocytic lymphomas, Hodgkin's
lymphomas, choriocarcinoma and anaplastic carcinoma of
unknown primary site. The median age of AIDS-KS
patients was 29 years (range 3-50) compared to 36
years (range 3-79) for endemic KS patients and 38
years (range 21-73) for cancer controls.

Among KS lesions, 39 of 44 (89%) were positive for
KS330₂₃₃ PCR product, including KS tissues from 22 of
24 (92%) HIV seropositive and 17 of 20 (85%) HIV
seronegative patients. In comparison, 3 of 22 (14%)
nonKS cancer control tissues were positive, including
1 of 7 (14%) HIV seropositive and 2 of 15 (13%) HIV
seronegative control patients (Figure 19). These
control patients included a 73 year old HIV
seronegative male and a 29 year old HIV seronegative
female with breast carcinomas, and a 36 year old HIV
seropositive female with ovarian carcinoma. The odds
ratios for detecting the sequences in tissues from HIV
seropositive and HIV seronegative cases and controls
were 66 (95% confidence interval (95% C.I.) 3.8-3161)
and 36.8 (95% C.I. 4.3-428) respectively. The overall
weighted Mantel-Haenszel odds ratio stratified by HIV
serostatus was 49.2 (95% C.I. 9.1-335). KS tissues
from four HIV seropositive children (ages 3, 5, 6, and
7 years) and four HIV seronegative children (ages 3,
4, 4, and 12 years) were all positive for KS330₂₃₃.

All discordant results (i.e. KSHV negative KS or KSHV
positive nonKS cancers) were reviewed microscopically.
All KS330₂₃₃ PCR negative KS samples were confirmed to
be KS. Likewise, all KS330₂₃₃ PCR positive nonKS
cancers were found not to have occult KS
histopathologically.

DISCUSSION 

These results indicate that KSHV DNA sequences are
found not only in AIDS-KS [5], classical KS [6] and
transplant KS [7] but also in African KS from both HIV
seropositive and seronegative patients. Despite
differences in clinical and epidemiological features,
KSHV DNA sequences are present in all major clinical
subtypes of KS from widely dispersed geographic
settings.

This study was performed on banked, formalin-fixed
tissues, which prevented the use of specific detection
assays such as Southern hybridization. DNA extracted
after such treatment is often fragmented, which reduces
the detection sensitivity of PCR and may account for
the 5 PCR negative KS samples found in the study. The
results, however, are unlikely to be due to PCR
contamination or nonspecific amplification. Specimens
were tested blindly and a subset of samples was
independently extracted and tested at a physically
separate laboratory. Specimen blinding is essential
to ensure the integrity of results based solely on PCR
analyses. A subset of amplicons was sequenced and
found to be more than 98% identical to the published
KS330₂₃₃ sequence, confirming their specific nature and,
because of minor sequence variation, making the
possibility of contamination unlikely.
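
The identity comparison between sequenced amplicons and the published sequence can be expressed as a simple position-by-position calculation. The Python sketch below is illustrative only: it assumes the two sequences are already aligned and of equal length, the function name percent_identity is hypothetical, and the example strings are arbitrary placeholders of assumed length 233, not KSHV sequence.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions at which two equal-length sequences agree."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Placeholder sequences (hypothetical, for illustration only).
reference = "ACGT" * 58 + "A"        # 233 positions
variant = "CATG" + reference[4:]     # differs from reference at 4 positions

print(f"{percent_identity(reference, variant):.2f}% identical")
```

A product of this assumed length that differs from the reference at four positions is still more than 98% identical, which shows how minor sequence variation is compatible with the stated threshold.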

In contrast to previous studies in North American and
European populations, 3 of 22 control
tissues were found to have evidence of KSHV infection. Since
these cancers represent a variety of tissue types, it
is unlikely that KSHV has an etiologic role in these
tumors. One possible explanation for the findings is
that these results reflect the rate of KSHV infection
in the nonKS population in Uganda. Four independent
controlled studies from North America [5 and 9], Europe
[7] and Asia [8] have failed to detect evidence of
KSHV infection in over 200 cancer control tissues,
with the exception of an unusual AIDS-associated,
body-cavity-based lymphoma [9]. Taken together, these
studies indicate that DNA-based detection of KSHV
infection is rare in most nonKS cancer tissues from
developed countries. KSHV infection has been reported
in post-transplant skin tumors, although
well-controlled studies are needed to confirm that these
findings are not due to PCR contamination [10]. Since
HIV-negative KS is much more frequent in
Uganda than in the United States, detection of KSHV in
control tissues from cancer patients in the study may
reflect a relatively high prevalence of infection in the
general Ugandan population.

While KS is extremely rare among children in developed
countries [2], the rate of KS in Ugandan children has
risen dramatically over the past 3 decades:
age-standardized rates (per 100,000) for boys age 0-14
years were 0.25 in 1964-68 and 10.1 in 1992-93.
Detection of KSHV genome in KS lesions from
prepubertal children suggests that the virus has a
nonsexual mode of transmission among Ugandan children.
That five of these children were 5 years old or less
raises the possibility that the agent can be
transmitted perinatally. Whether or not immune
tolerance due to perinatal transmission accounts for
the more fulminant form of KS occurring in African
children remains to be investigated.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

  (i) APPLICANT: The Trustees of Columbia University in the City
      New York City

  (ii) TITLE OF INVENTION: UNIQUE ASSOCIATED KAPOSI'S SARCOMA VIRUS
      SEQUENCES AND USES THEREOF

  (iii) NUMBER OF SEQUENCES: 45

  (iv) CORRESPONDENCE ADDRESS:
      (A) ADDRESSEE: Cooper & Dunham LLP
      (B) STREET: 1185 Avenue of the Americas
      (C) CITY: New York
      (D) STATE: New York
      (E) COUNTRY: U.S.A.
      (F) ZIP: 10036

  (v) COMPUTER READABLE FORM:
      (A) MEDIUM TYPE: Floppy disk
      (B) COMPUTER: IBM PC compatible
      (C) OPERATING SYSTEM: PC-DOS/MS-DOS
      (D) SOFTWARE: PatentIn Release #1.25

  (vi) CURRENT APPLICATION DATA:
      (A) APPLICATION NUMBER:
      (B) FILING DATE:
      (C) CLASSIFICATION:

  (viii) ATTORNEY/AGENT INFORMATION:
      (A) NAME: White, John P.
      (B) REGISTRATION NUMBER: 28,678
      (C) REFERENCE/DOCKET NUMBER: 45185-D-PCT/JPW/MSC

  (ix) TELECOMMUNICATION INFORMATION:
      (A) TELEPHONE: (212) 278-0400
      (B) TELEFAX: (212) 391-0525

(2) INFORMATION FOR SEQ ID NO:1:

  (i) SEQUENCE CHARACTERISTICS:
      (A) LENGTH: 20710 base pairs
      (B) TYPE: nucleic acid
      (C) STRANDEDNESS: single
      (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: DNA (genomic)
  (iii) HYPOTHETICAL: N
  (iv) ANTI-SENSE: N

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
TCGAGTCGGA GAGTTGGCAC AGGCCTTGAG CTCGCTGTGA CGTTCTCACG GTGTTGGTTG  60
GGATCAGCTG GTGACTCAGA CAAGTCTTGA GCTCTACAAC GTAACATACG GGCTGATGCC  120
CACCCGATAC CAGAATTACG CAGTCGGCAA TTCTGTGCCC TAGAGTCACC TCAAAGAATA  180
ATCTGTGGTG TCCAAGGGGA GGGTTCTGGG GCCGGCTACT TAGAAACCGC CATAGATCGG  240
GCAGGGTGGA GTACTTGAGG AGCCGGCGGT AGGTGGCCAG GTGGGCCCGG TTACCTG\-TC  300
TTTTGCGTGC TGCTGGAAGC CTGCTCAGGG ATTTCTTAAC CTCGGCCTCG GTTGGACGTA  360
CCATGGCAGA AGGCGGTTTT GGAGCGGACT CGGTGGGGCG CGGCGGAGAA AAGGCCTCTo  420
TGACTAGGGG AGGCAGGTGG GACTTGGGGA GCTCGGACGA CGAATCAAGC ACCTCCACAA  480
CCAGCACGGA TATGGACGAC CTCCCTGAGG AGAGGAAACC ACTAACGGGA AAGTCTGTAA  540
AAACCTCGTA CATATACGAC GTGCCCACCG TCCCGACCAG CAAGCCGTGG CATTTAATGC  600
ACGACAACTC CCTCTACGCA ACGCCTAGGT TTCCGCCCAG ACCTCTCATA CGGCACCCTT  660
CCGAAAAAGG CAGCATTTTT GCCAGTCGGT TGTCAGCGAC TGACGACGAC TCGGGAGACT  720
ACGCGCCAAT GGATCGCTTC GCCTTCCAGA GCCCCAGGGT GTGTGGTCGC CCTCCCCTTC  780
CGCCTCCAAA TCACCCACCT CCGGCAACTA GGCCGGCAGA CGCGTCAATG GGGGACGTGG  840
GCTGGGCGGA TCTGCAGGGA CTCAAGAGGA CCCCAAAGGG ATTTTTAAAA ACATCTACCA  900
AGGGGGGCAG TCTCAAAGCC CGTGGACGCG ATGTAGGTGA CCGTCTCAGG GACGGCGGCT  960
TTGCCTTTAG TCCTAGGGGC GTGAAATCTG CCATAGGGCA AAACATTAAA TCATGGTTGG  1020
GGATCGGAGA ATCATCGGCG ACTGCTGTCC CCGTCACCAC GCAGCTTATG GTACCGGTGC  1080
ACCTCATTAG AACGCCTGTG ACCGTGGACT ACAGGAATGT TTATTTGCTT TACTTAGAGG  1140
GGGTAATGGG TGTGGGCAAA TCAACGCTGG TCAACGCCGT GTGCGGGATC TTGCCCCAGG  1200
AGAGAGTGAC AAGTTTTCCC GAGCCCATGG TGTACTGGAC GAGGGCATTT ACAGATTGTT  1260
ACAAGGAAAT TTCCCACCTG ATGAAGTCTG GTAAGGCGGG AGACCCGCTG ACGTCTGCCA  1320
AAATATACTC ATGCCAAAAC AAGTTTTCGC TCCCCTTCCG GACGAACGCC ACCGCTATCC  1380
TGCGAATGAT GCAGCCCTGG AACGTTGGGG GTGGGTCTGG GAGGGGCACT CACTGGTGCG  1440
TCTTTGATAG GCATCTCCTC TCCCCAGCAG TGGTGTTCCC TCTCATGCAC CTGAAGCACG  1500
GCCGCCTATC TTTTGATCAC TTCTTTCAAT TACTTTCCAT CTTTAGAGCC ACAGAAGGCG  1560
ACGTGGTCGC CATTCTCACC CTCTCCAGCG CCGAGTCGTT GCGGCGGGTC AGGGCGAGGG  1620
GAAGAAAGAA CGACGGGACG GTGGAGCAAA ACTACATCAG AGAATTGGCG TGGGCTTATC  1680
ACGCCGTGTA CTGTTCATGG ATCATGTTGC AGTACATCAC TGTGGAGCAG ATGGTACAAC  1740
TATGCGTACA AACCACAAAT ATTCCGGAAA TCTGCTTCCG CAGCGTGCGC CTGGCACACA  1800
AGGAGGAAAC TTTGAAAAAC CTTCACGAGC AGAGCATGCT ACCTATGATC ACCGGTGTAC  1860
TGGATCCCGT GAGACATCAT CCCGTCGTGA TCGAGCTTTG CTTTTGTTTC TTCACAGAGC  1920
TGAGAAAATT ACAATTTATC GTAGCCGACG CGGATAAGTT CCACGACGAC GTATGCGGCC  1980
TGTGGACCGA AATCTACAGG CAGATCCTGT CCAATCCGGC TATTAAACCC AGGGCCATCA  2040
ACTGGCCAGC ATTAGAGAGC CAGTCTAAAG CAGTTAATCA CCTAGAGGAG ACATGCAGGG  2100
TCTAGCCTTC TTGGCGGCCC TTGCATGCTG GCGATGCATA TCGTTGACAT GTGGAGCCAC  2160
TGGCGCGTTG CCGACAACGG CGACGACAAT AACCCGCTCC GCCACGCAGC TCATCAATGG  2220
GAGAACCAAC CTCTCCATAG AACTGGAATT CAACGGCACT AGlTi'l'TiTC TAAATTGGCA  2280
AAATCTGTTG AATGTGATCA CGGAGCCGGC CCTGACAGAG TTGTGGACCT CCGCCGAAG7  2340

CGCCGAGGAC CTCAGGGTAA CTCTGAAAAA GAGGCAAAGT CTTTTTTTCC CCAACAAGAC  2400
AGTTGTGATC TCTGGAGACG GCCATCGCTA TACGTGCGAG GTGCCGACGT CGTCGCAAAC  2460
TTATAACATC ACCAAGGGCT TTAACTATAG CGCTCTGCCC GGGCACCTTG GCGGATTTGG  2520
GATCAACGCG CGTCTGGTAC TGGGTGATAT CTTCGCATCA AAATGGTCGC TATTCGCGAG  2580
GGACACCCCA GAGTATCGGG TGTTTTACCC AATGAATGTC ATGGCCGTCA AGTTTTCCAT  2640
ATCCATTGGC AACAACGAGT CCGGCGTAGC GCTCTATGGA GTGGTGTCGG AAGATTTCGT  2700
GGTCGTCACG CTCCACAACA GGTCCAAAGA GGCTAACGAG ACGGCGTCCC ATCTTCTGTT  2760
CGGTCTCCCG GATTCACTGC CATCTCTGAA GGGCCATGCC ACCTATGATG AACTCACGTT  2820
CGCCCGAAAC GCAAAATATG CGCTAGTGGC GATCCTGCCT AAAGATTCTT ACCAGACACT  2880
CCTTACAGAG AATTACACTC GCATATTTCT GAACATGACG GAGTCGACGC CCCTCGAGTT  2940
CACGCGGACG ATCCAGACCA GGATCGTATC AATCGAGGCC AGGCGCGCCT GCGCAGCTCA  3000
AGAGGCGGCG CCGGACATAT TCTTGGTGTT GTTTCAGATG TTGGTGGCAC ACTTTCTTGT  3060
TGCGCGGGGC ATTGCCGAGC ACCGATTTGT GGAGGTGGAC TGCGTGTGTC GGCAGTATGC  3120
GGAACTGTAT TTTCTCCGCC GCATCTCGCG TCTGTGCATG CCCACGTTCA CCACTGTCGG  3180
GTATAACCAC ACCACCCTTG GCGCTGTGGC CGCCACACAA ATAGCTCGCG TGTCCGCCAC  3240
GAAGTTGGCC AGTTTGCCCC GCTCTTCCCA GGAAACAGTG CTGGCCATGG TCCAGCTTGG  3300
CGCCCGTGAT GGCGCCGTCC CTTCCTCCAT TCTGGAGGGC ATTGCTATGG TCGTCGAACA  3360
TATGTATACC GCCTACACTT ATGTGTACAC ACTCGGCGAT ACTGAAAGAA AATTAATGTT  3420
GGACATACAC ACGGTCCTCA CCGACAGCTG CCCGCCCAAA GACTCCGGAG TATCAGAAAA  3480
GCTACTGAGA ACATATTTGA TGTTCACATC AATGTGTACC AACATAGAGC TGGGCGAAAT  3540
GATCGCCCGC TTTTCCAAAC CGGACAGCCT TAACATCTAT AGGGCATTCT CCCCCTGCTT  3600
TCTAGGACTA AGGTACGATT TGCATCCAGC CAAGTTGCGC GCCGAGGCGC CGCAGTCGTC  3660
CGCTCTGACG CGGACTGCCG TTGCCAGAGG AACATCGGGA TTCGCAGAAT TGCTCCACGC  3720
GCTGCACCTC GATAGCTTAA ATTTAATTCC GGCGATTAAC TGTTCAAAGA TTACAGCCGA  3780
CAAGATAATA GCTACGGTAC CCTTGCCTCA CGTCACGTAT ATCATCAGTT CCGAAGCACT  3840
CTCGAACGCT GTTGTCTACG AGGTGTCGGA GATCTTCCTC AAGAGTGCCA TGTTTATATC  3900
TGCTATCAAA CCCGATTGCT CCGGCTTTAA CTTTTCTCAG ATTGATAGGC ACATTCCCAT  3960
AGTCTACAAC ATCAGCACAC CAAGAAGAGG TTGCCCCCTT TGTGACTCTG TAATCATGAG  4020
CTACGATGAG AGCGATGGCC TGCAGTCTCT CATGTATGTC ACTAATGAAA GGGTGCAGAC  4080
CAACCTCTTT TTAGATAAGT CACCTTTCTT TGATAATAAC AACCTACACA TTCATTATTT  4140
GTGGCTGAGG GACAACGGGA CCGTAGTGGA GATAAGGGGC ATGTATAGAA GACGCGCAGC  4200
CAGTGCTTTG TTTCTAATTC TCTCTTTTAT TGGGTTCTCG GGGGTTATCT ACTTTCTTTA  4260
CAGACTGTTT TCCATCCTTT ATTAGACGGT CAATAAAGCG TAGATTTTTA AAAGGTTTCC  4320
TGTGCATTCT TTTTGTATGG GCATATACTT GGCAAGAAAT CCGAGCACCT CAGAAAGTGG  4380

ATTGCCGTCA CATATCAGTT CGACCACCCC TGCACCTAGC CATGCGGCGC TTTGACGGTC  4440
TTTGGGGCTA CACATCATAA AGTACTTTTC CATGGCTTCT ATAAGCACCT TGGAACAATC  4500
TGGGGGTTGG CGAATGGGTT CCCTAAACGG GAAATCCTCT ATGGTATTCA GGCAGAAGAC  4560
CGCGTCCTCC ACCCGACGTT TGAGTCTTTC TAGCAGAGCG CCGAAGAACT CCCGCTCGTG  4620
TGTTTTCGCA GGGGCAAGTT CTGCGCCGTA CAGCGATGAG AAACACGACA CGATGTTTTC  4680
CAGCCCCATG CTGCGCAGCA ACACGTGCTT CAGGAACAGG TGTTGTAGCC GGTTCAGTTT  4740
TAGCTTGGGT AGAAAAGTTA TCGAGTTGTT AGCACGCTCC ATGATGGTAA CGGTGTTGAA  4800
GTCACAGACC GGGCTTTCTC CGAGTCTCGG CCGCCTGAGT CCAATCATGT AGAACATAGA  4860
CGCGGCCTCG TTGTCTGTGT TAAGTGACAC GATATCCCGT TCGCAAACCT GTGCGATGTT  4920
GTGTTTCAGT ATAGATCTGG TCTGACCGGC ACGGGGTGTT ATGGGGTGAC GCGGTAAAGG  4980
CGACTCTGGG TCAAACACCT TTATGCGGTT GGCGGCCTCG TCGATGACGA CACGCTTGTT  5040
CGCGGCGTGT ATGGGGACGC GACGGCATCC CGCTGGCAGA TCTATAATCT TAAAGTTGGT  5100
ATAAGACTGG TCGCTCGTTA TGGCCAGCCG GCACTCCGGT AGTATCTGCG TGTCCTCGAA  5160
TTCGTGGCCG CGTACGACTG GCTTGGAGTG CAGGTAAACG CCAAGAGATG CGGTCTCTTC  5220
GCCTACGCAC AAGTGGCTTC TTAACGCGTA GGGGTGCGGT GAGAGCATGA TCCGTAGCAA  5280
CGATAGTTCC GGGTGCCTAG CCGCGTAGAG TGGCAGGGTA GACGAGTCCG GAGTCCCAAA  5340
CTTTTCGAAC AACAGTGGCA TCGGGACTTC AGGATTAGAG ACTCCCACCA TGGCCGCCAC  5400
CGCCGGAGAG GTCAAGACGT GAAACACGCG CTCGCCTGTC GACAGGCGCG CCGCGCCCTC  5460
TACTAGACTA GCCTTCACGT CCGGAACTCG TAACATAGCT TAGACCAGCG GACGGACGCA  5520
ACGTACGCGG GGATCGGCTG GCGGTGTCTG CTCGTTGGAC GCGGCCGTTC GGTGGCGCCA  5580
GTGCAGGCCT AGTTTGCGAA TGGCGTGACG GACAATTTGT GGCTTTAGAG CGGCGAACCG  5640
ATGACCCGTG GTGGCGACGA ACGAAATGAA GTTTGCATTG CGGCCCAACT CGTCTAGCCT  5700
GGTCTTCTTG TTTCGGGCAT AGATTTTCGG GATTAGGTTA CACTTTTTAT ATCCCAGTAC  5760
TGCGCACTCG TGTTTGCTTT TAGTGTGACT GATTATCTTC TTTGAGAAGT CAAACAGGCC  5820
CCGGGCGGCG GCTCGCCTAA TGCAAGCCAC GTCAAGCCTG AGAAACGAAC AGCATTCCAC  5880
CAGACACTCC AGGAACCTTT TGTGTAGCGT CTGTATTTGG GAACGGTTTC TGTGCTCAAG  5940
TAGGGAGAAT ATTCTATTTT TGTTTCCGTC GATGCGCGCG TGCTGGTCCG TGAGAATGGG  6000
CGCCAGCTCG TGGCGAATCT GTTCCACAAG AGGCTGCCCG TACACTTTAG AAATCGTGGC  6060
TGTCGCGGCC TTAAACCAGG ACACGTTTAG CCCATCCTTG CTGGAGACCA CAGATGGAAA  6120
GTTTGTGGTC CAAAATACGT TTTTTCGCCC CATTCTCACC ATGTACTGGT TTTCCAGTCC  6180
GTGCAGGTCC AACGTGGAGT TCCAATTTGC TATCGATACA GGAAATATGT GCCTGATTGG  6240
CAGAAAGCAT TTCAGCGTAC CCATTGCGAA GAGAAAGTGC AGCATGTCCC CACTGATGTT  6300
GATGTTTATT GCGGTGCCTT GACACATGTT GTCGGAAAAA AACACGCTTA TGGTAAAAGA  6360
AGGTTCCTTT ACGGAGTACT TTCGTATAAC AAAATTGTTG GTCAATCTGG GGATGTTTAA  6420

AATAGTCTTT TGCAGGGTGT TAGGAACGTG GCAGCTTATC TTAGTGTTAA TCACCATGTT 64 BO 

GGTGTTGAAT ATGGTGATCT TGAAGTTTTC CAAACTGACG TGTTTTGTGG GTTCCAGCAT 654 0 

GTCTGACACT GTAGAG CTGC CCAGAGTCCG CGCGTCCGTG GCCGCGTATC GTTGGAAGCA 66 00 

CGCCTGCAAA TTTCCTTTCA TGGCTGCTCG CCGGTC T TTC GGCGCGTACC GGATTCTTGA 666 0 

AAGCGTCGCC GCCAGGAGAC GCGGTGTCTC GTGGGTGCCT AAAAAGTTTG CGCAGGGGTG 672 0 

CAGTCCGCTG CACGAGTGGC CGATGCAGTC TGCCACTGCC ATACACATGA CGAGTCTGTA 678 0 

GATGGCCGGT GTGCCCGGAT ACACTAGATA GTAGGTACAA TCTGGGGTAC TGACGACCAC 684 0 

CCTGTATGGC TTTGGTCCGG GGTCCTTGCG TTGGATTTTT ACG TG CAG AC GGGACACGAG 6 900 

CTGGTTTAGA GCCAGCTGAA AGCCCACCAG ATCCCGTCCG TTAACCTTGA CGTCCTGGTG 6960 

CTTACTCTGT TTCGACAGGT TCTTCAGCAC GGTGGGCAGT CGCTCTACGT TGTGAGCGAT 7 02 0 

GGCACGGCGC AGCGAGACCA GCTCTCCGTG CCACCCCCAC GTGGCCATGA AGCTGCTGAT 708 0 

GTTAAACTTT AAAAAATGTA GCTGTGCGTC TGGGGATGCG GGTGGCATTA TTGAAAACGA 714 0 

GAGATGCTTC AGGCTCTCCA GGAGTGCAAA ATAATTTTGA TAGATTGTGG GTTGTAGACT 7200 

ATGGGGCAAC ACCGCCAGAA ACG CATGAAA ACACTGTTCG AACTCCCAGA ACTCCAGGTA 726 0 

CCTGCACACT AT CCTGAAC A TGGCTTTGTA ACATATGGTG CACGTTAGTA GCGCGG G AAG 732 0 

ATAC AG CG AG CGTAGCTCCC TGAATTCGCA GGGTTTATCA CAATCATCGG TAAGTTCCCA 73 8 0 

TGATCCCACC GCAGGTAGGT AGTTGTCGGT GTCTATCTGT CCGCGCGTAA ACACTCCACC 74 4 0 

ACCGTCAATT ATTAAAC CTT CGCCGCTGTA CCGTCGACCC AC TTTT CCCA AAA GAG TC C C 75 00 

TTCTTGATGT ATAAAAGGGT GGAGGCGTTC CCCCAGGAGT AGTCTGCGTA TCGCTCTGCA 756 0 

GGCGAAAAAG GTGGGCTCGG GCTG CATCAT CTTATCAAGA CCTTCTAAGG TCAGCTCTGC 762 0 

CTGCAGGTGC GAGTTGGTGG CCAGACAGCA GAATATTTCC AG CTGTG ATT CCCAAGTCGC 768 0 

TTGATAACAC GTGGTCTGCG .GACTCGTCGT CAGGGAGGCG CTCGGTGGCA GTAGTAGGGG 774 0 

GCCCTCGAGC GCTGCCATGG AGGCGACCTT GGAGCAACGA CCTTTCCCGT ACCTCGCCAC 78 00 

GGAGGCCAAC CTCCTAACGC AGATTAAGGA GTCGGCTGCC G ACGG ACT CT TCAAGAGCTT 786 0 

TCAG CTATTG CTCGGCAAGG ACG C CAG AG A AGGCAGTGTC CGTTTCGAAG CGCTACTGGG 7 92 0 

CGTATATACC AATGTGGTGG AGTTTGTTAA GTTTCTGGAG ACCGCCCTCG CCGCCGCTTG 798 0 

CGTCAATACC GAGTTCAAGG ACCTG CGG AG AATGATAGAT GGAAAAATAC AGTTTAAAAT 804 0 

TTCAATGCCC ACTATTG C C C ACGGAGACGG GAGGAGGCCC AACAAGCAGA G AC AG TAT AT 8100 

CGTCATGAAG GCTTGCAATA AGCACCACAT CGGTG CGGAG ATTGAGCTTG CGGCCGCAGA 816 0 

CATC GAG CTT CTCTTCGCCG AGAAAGAGAC GCCCTTGGAC TTCACAGAGT ACGCGGGTGC 8 22 0 

CATCAAGACG ATTACGTCGG CTTTGCAGTT TGGTATGGAC G C C CTAGAAC GGGGG CTAGT 82 80 

GGACACGGTT CT CG CAG TT A AACTTCGGCA CGCTCCACCC GTCTTTATTT TAAAGACGCT 8 34 0 

GGGCGATCCC GTCTACTCTG AGAGGGG C CT CAAAAAGGCC GTCAAGTCTG ACATGGTATC 84 00 
CATGTTCAAG GCACACCTCA TAGAACATTC ATTTTTTCTA GATAAGGCCG AGCTCATGAC 8460

AAGGGGGAAG CAGTATGTCC TAACCATGCT CTCCGACATG CTGGCCGCGG TGTGCGAGGA 8520

TACCGTCTTT AAGGGTGTCA GCACGTACAC CACGGCCTCT GGGCAGCAGG TGGCCGGCGT 8580

CCTGGAGACG ACGGACAGCG TCATGAGACG GCTGATGAAC CTGCTGGGGC AAGTGGAAAG 8640

TGCCATGTCC GGGCCCGCGG CCTACGCCAG CTACGTTGTC AGGGGTGCCA ACCTCGTCAC 8700

CGCCGTTAGC TACGGAAGGG CGATGAGAAA CTTTGAACAG TTTATGGCAC GCATAGTGGA 8760

CCATCCCAAC GCTCTGCCGT CTGTGGAAGG TGACAAGGCC GCTCTGGCGG ACGGACACGA 8820

CGAGATTCAG AGAACCCGCA TCGCCGCCTC TCTCGTCAAG ATAGGGGATA AGTTTGTGGC 8880

CATTGAAAGT TTGCAGCGCA TGTACAACGA GACTCAGTTT CCCTGCCCAC TGAACCGGCG 8940

CATCCAGTAC ACCTATTTCT TCCCTGTTGG CCTTCACCTT CCCGTGCCCC GCTACTCGAC 9000

ATCCGTCTCA GTCAGGGGCG TAGAATCCCC GGCCATCCAG TCGACCGAGA CGTGGGTGGT 9060

TAATAAAAAC AACGTGCCTC TTTGCTTCGG TTACCAAAAC GCCCTCAAAA GCATATGCCA 9120

CCCTCGAATG CACAACCCCA CCCAGTCAGC CCAGGCACTA AACCAAGCTT TTCCCGATCC 9180

CGACGGGGGA CATGGGTACG GTCTCAGGTA TGAGCAGACG CCAAACATGA ACCTATTCAG 9240

AACGTTCCAC CAGTATTACA TGGGGAAAAA CGTGGCATTT GTTCCCGATG TGGCCCAAAA 9300

AGCGCTCGTA ACCACGGAGG ATCTACTGCA CCCAACCTCT CACCGTCTCC TCAGATTGGA 9360

GGTCCACCCC TTCTTTGATT TTTTTGTGCA CCCCTGTCCT GGAGCGAGAG GATCGTACCG 9420

CGCCACCCAC AGAACAATGG TTGGAAATAT ACCACAACCG CTCGCTCCAA GGGAGTTTCA 9480

GGAAAGTAGA GGGGCGCAGT TCGACGCTGT GACGAATATG ACACACGTCA TAGACCAGCT 9540

AACTATTGAC GTCATACAGG AGACGGCATT TGACCCCGCG TATCCCCTGT TCTGCTATGT 9600

AATCGAAGCA ATGATTCACG GACAGGAAGA AAAATTCGTG ATGAACATGC CCCTCATTGC 9660

CCTGGTCATT CAAACCTACT GGGTCAACTC GGGAAAACTG GCGTTTGTGA ACAGTTATCA 9720

CATGGTTAGA TTCATCTGTA CGCATATTGG GAATGGAAGC ATCCCTAAGG AGGCGCACGG 9780

CCACTACCGG AAAATCTTAG GCGAGCTCAT CGCCCTTGAG CAGGCGCTTC TCAAGCTCGC 9840

GGGACACGAG ACGGTGGGTC GGACGCCGAT CACACATCTG GTTTCGGCTC TCCTCGACCC 9900

GCATCTGCTG CCTCCCTTTG CCTACCACGA TGTCTTTACG GATCTTATGC AGAAGTCATC 9960

CAGACAACCC ATAATCAAGA TCGGGGATCA AAACTACGAC AACCCTCAAA ATAGGGCGAC 10020

ATTCATCAAC CTCAGGGGTC GCATGGAGGA CCTAGTCAAT AACCTTGTTA ACATTTACCA 10080

GACAAGGGTC AATGAGGACC ATGACGAGAG ACACGTCCTG GACGTGGCGC CCCTGGACGA 10140

GAATGACTAC AACCCGGTCC TCGAGAAGCT ATTCTACTAT GTTTTAATGC CGGTGTGCAG 10200

TAACGGCCAC ATGTGCGGTA TGGGGGTCGA CTATCAAAAC GTGGCCCTGA CGCTGACTTA 10260

CAACGGCCCC GTCTTTGCGG ACGTCGTGAA CGCACAGGAT GATATTCTAC TGCACCTGGA 10320

GAACGGAACC TTGAAGGACA TTCTGCAGGC AGGCGACATA CGCCCGACGG TGGACATGAT 10380

CAGGGTGCTG TGCACCTCGT TTCTGACGTG CCCTTTCGTC ACCCAGGCCG CTCGCGTGAT 10440
CACAAAGCGG GACCCGGCCC AGAGTTTTGC CACGCACGAA TACGGGAAGG ATGTGGCGCA 10500
GACCGTGCTT GTTAATGGCT TTGGTGCGTT CGCGGTGGCG GACCGCTCTC GCGAGGCGGC 10560
GGAGACTATG TTTTATCCGG TACCCTTTAA CAAGCTCTAC GCTGACCCGT TGGTGGCTGC 10620
CACACTGCAT CCGCTCCTGC CAAACTATGT CACCAGGCTC CCCAACCAGA GAAACGCGGT 10680
GGTCTTTAAC GTGCCATCCA ATCTCATGGC AGAATATGAG GAATGGCACA AGTCGCCCGT 10740
CGCGGCGTAT GCCGCGTCTT GTCAGGCCAC CCCGGGCGCC ATTAGCGCCA TGGTGAGCAT 10800
GCACCAAAAA CTATCTGCCC CCAGTTTCAT TTGCCAGGCA AAACACCGCA TGCACCCTGG 10860
TTTTGCCATG ACAGTCGTCA GGACGGACGA GGTTCTAGCA GAGCACATCC TATACTGCTC 10920
CAGGGCGTCG ACATCCATGT TTGTGGGCTT GCCTTCGGTG GTACGGCGCG AGGTACGTTC 10980
GGACGCGGTG ACTTTTGAAA TTACCCACGA GATCGCTTCC CTGCACACCG CACTTGGCTA 11040
CTCATCAGTC ATCGCCCCGG CCCACGTGGC CGCCATAACT ACAGACATGG GAGTACATTG 11100
TCAGGACCTC TTTATGATTT TCCCAGGGGA CGCGTATCAG GACCGCCAGC TGCATGACTA 11160
TATCAAAATG AAAGCGGGCG TGCAAACCGG CTCACCGGGA AACAGAATGG ATCACGTGGG 11220
ATACACTGCT GGGGTTCCTC GCTGCGAGAA CCTGCCCGGT TTGAGTCATG GTCAGCTGGC 11280
AACCTGCGAG ATAATTCCCA CGCCGGTCAC ATCTGACGTT GCCTATTTCC AGACCCCCAG 11340
CAACCCCCGG GGGCGTGCGG CGTCGGTCGT GTCGTGTGAT GCTTACAGTA ACGAAAGCGC 11400
AGAGCGTTTG TTCTACGACC ATTCAATACC AGACCCCGCG TACGAATGCC GGTCCACCAA 11460
CAACCCGTGG GCTTCGCAGC GTGGCTCCCT CGGCGACGTG CTATACAATA TCACCTTTCG 11520
CCAGACTGCG CTGCCGGGCA TGTACAGTCC TTGTCGGCAG TTCTTCCACA AGGAAGACAT 11580
TATGCGGTAC AATAGGGGGT TGTACACTTT GGTTAATGAG TATTCTGCCA GGCTTGCTGG 11640
GGCCCCCGCC ACCAGCACTA CAGACCTCCA GTACGTCGTG GTCAACGGTA CAGACGTGTT 11700
TTTGGACCAG CCTTGCCATA TGCTGCAGGA GGCCTATCCC ACGCTCGCCG CCAGCCACAG 11760
AGTTATGCTT GCCGAGTACA TGTCAAACAA GCAGACACAC GCCCCAGTAC ACATGGGCCA 11820
GTATCTCATT GAAGAGGTGG CGCCGATGAA GAGACTATTA AAGCTCGGAA ACAAGGTGGT 11880
GTATTAGCTA ACCCTTCTAG CGTTGGCTAG TCATGGCACT CGACAAGAGT ATAGTGGTTA 11940
ACTTCACCTC CAGACTCTTC GCTGATGAAC TGGCCGCCCT TCAGTCAAAA ATAGGGAGCG 12000
TACTGCCGCT CGGAGATTGC CACCGTTTAC AAAATATACA GGCATTGGGC CTGGGGTGCG 12060
TATGCTCACG TGAGACATCT CCGGACTACA TCCAAATTAT GCAGTATCTA TCCAAGTGCA 12120
CACTCGCTGT CCTGGAGGAG GTTCGCCCGG ACAGCCTGCG CCTAACGCGG ATGGATCCCT 12180
CTGACAACCT TCAGATAAAA AACGTATATG CCCCCTTTTT TCAGTGGGAC AGCAACACCC 12240
AGCTAGCAGT GCTACCCCCA TTTTTTAGCC GAAAGGATTC CACCATTGTG CTCGAATCCA 12300
ACGGATTTGA CCCCGTGTTC CCCATGGTCG TGCCGCAGCA ACTGGGGCAC GCTATTCTGC 12360
AGCAGCTGTT GGTGTACCAC ATCTACTCCA AAATATCGGC CGGGGCCCCG GATGATGTAA 12420
ATATGGCGGA ACTTGATCTA TATACCACCA ATGTGTCATT TATGGGGCGC ACATATCGTC 12480
CGCGCCACCC CACTAGGCAG 14160
GGCCACCGTG GCGTATCAGG TCCTTCGCAC CCTGGGACCG CAGGCCGGGT CACATGCACC 14220
GCCGACGGTG GGCATAGCTA CCCAGGAGCC CTACCGTACA ATATACATGC CAGATTAGAA 14280
CGGGGTGTGT GCTATAATGG ATGGCTATGG GGGGGGGCTG TAGATAATTG AGCGCTGTGC 14340
TTTTATTGTG GGGATATGGG CTTGTACATG TGTCTATCAT CGGTAGCCAT AAAATGGGCC 14400
ATGACAACTG CCACAAGTAA GTCGTCCGAC ATGTGLTITT GCTTGGCGCT GTATGACTGC 14460
CCTCCATCCC TAAGCGGGAC GCACTTGATC GCGCGGACCT GTTCTACCAG GTAGGTCACC 14520
GGGTCAAATG ATATTTTGAT GGTGTTGGAC ACCACCGTCT GGCTGGCGCT CAGGGTGCCG 14580
GAGTTCAGAG CGTAGATGAA TGTCTCAAAC GCGGAGGATT TCTCGCCTCC CAACATGTAA 14640
ATTGGCCACT GCAGGGCGCT GCTCTTGTCA GTATAGTGTA GAAAATGTAT GGGGAGCGGG 14700
CATATTTCGT TAAGGACGGT TGCAATGGCC ACCCCAGAAT CTTGGCTGCT GTTGCCTTCG 14760
ACCGCCGCGT TCACGCGCTC AATTGTGGTG TGGAGCACAG CGATCGCCTT AATCATCGTG 14820
CATGCGCAGG ACGCTATCTC GTAAGCAGCT GCGCCAGTGA GGTCGCGCAG GAAGAAATGC 14880
TCCATGCCCA ATATGAGGCT TCTGGTGGGA GTCTGAGTAC TCGTGACAAC GGCGCCCACG 14940
CCAGTACCGG ACGCCTCCGT GTTGTTCGTA TACGCGGGGT CGATGTAAAC AAACAGCTGT 15000
TTTCCAAGGC ACTTCTGAAC CTCCTGGGCG GTGGTGTCTA CCCGACACAT GTCAAACTGT 15060
GTCAGCGCTG CGTCACCCAC CACGCGGTAA AGCGTAGCAT TTGACGACGC TGCTCCCTCG 15120
CCCATTAGTT CGGTGTCGAA TGCCCCCTCC ATAAAGAGGT TGGTGGTGGT TTTGATGGAT 15180
TCGTCGATGG TGATGTACGT CGGAATGTGC AGTCTGTAAC AAGGACAGGA CACTAGTGCG 15240
TCTTGCAGGT GGAAATCTTC TCGGTGGTCC GCACACACGT AACTGACCAC ATTCAGCATC 15300
TTTTCCTGGG CGTTCCTGAG GTTAAGCAGG AAACTCGTGG AGCGGTCTGA CGAGTTCACG 15360
GATGATATAA ATATAAGCTT GGCGTCTTTC TGAAGCATGA AACCCAGAAT AGCCGGCAGT 15420
GCATCCTTTT TAATAAAATT CGCCTCGTCT ACGTAGAGCA GGTTAAAGGT CTGTCCCCGA 15480
ATGCTCTGCA GACACGGAAA GACACAAAAG AGGGGCTCAT AAGCGGCTAA CAGTAAAGGA 15540
GAGGAGGCGA ACAGTGCGTG GCTCTTGGTT CTTGGGAATA AAAGGGGGCG TGTGTGCCGA 15600
TCGATCGTAT GGGTGAGCCA GTGGATCCTG GACATGTGGT GAATGAGAAA GATTTTGAGG 15660
AGTGTGAACA ATTTTTCAGT CAACCCCTTA GGGAGCAAGT GGTCGCGGGG GTCAGGGCAC 15720
TCGACGGCCT CGGTCTCGCT GACTCTCTAT GTCACAAAAC AGAAAGACTC TGCCTGCTGA 15780
TGGACCTGGT GGGCACGGAG TGCTTTGCGA GGGTGTGCCG CCTAGACACC GGTGCGAAAT 15840
GAAGAGTGTG GCGAGTCCCT TATGTCAGTT CCACGGCGTG TTTTGCCTGT ACCAGTGTCG 15900
CCAGTGCCTG GCATACCACG TGTGTGATGG GGGCGCCGAA TGCGTTCTCC TGCATACGCC 15960
GGAGAGCGTC ATCTGCGAAC TAACGGGTAA CTGCATGCTC GGCAACATTC AAGAGGGCCA 16020
GTTTTTAGGG CCGGTACCGT ATCGGACTTT GGATAACCAG GTTGACAGGG ACGCATATCA 16080
CGGGATGCTA GCGTGTCTGA AACGGGACAT TGTGCGGTAT TTGCAGACAT GGCCGGACAC 16140
CACCGTAATC GTGCAGGAAA TAGCCCTGGG GGACGGCGTC ACCGACACCA TCTCGGCCAT 16200
TATAGATGAA ACATTCGGTG AGTGTCTTCC CGTACTGGGG GAGGCCCAAG GCGGGTACGC 16260
CCTGGTCTGT AGCATGTATC TGCACGTTAT CGTCTCCATC TATTCGACAA AAACGGTGTA 16320
CAACAGTATG CTATTTAAAT GCACAAAGAA TAAAAAGTAC GACTGCATTG CCAAGCGGGT 16380
GCGGACAAAA TGGATGCGCA TGCTATCAAC GAAAGATACG TAGGTCCTCG CTGCCACCGT 16440
TTGGCCCACG TGGTGCTGCC TAGGACCTTT CTGCTGCATC ACGCCATACC CCTGGAGCCC 16500
GAGATCATCT TTTCCACCTA CACCCGGTTC AGCCGGTCGC CAGGGTCATC CCGCCGGTTG 16560
GTGGTGTGTG GGAAACGTGT CCTGCCAGGG GAGGAAAACC AACTTGCGTC TTCACCTTCT 16620

GGTTTGGCGC TTAGCCTGCC TCTGTTTTCC CACGATGGGA ACTTTCATCC ATTTGACATC 16680

TCGGTACTGC GCATTTCCTG CCCTGGTTCT AATCTTAGTC TTACTGTCAG ATTTCTCTAT 16740

CTATCTCTGG TGGTGGCTAT GGGGGCGGGA CGGAATAATG CGCGGAGTCC GACCGTTGAC 16800

GGGGTATCGC CGCCAGAGGG CGCCGTAGCC CACCCTTTGG AGGAACTGCA GAGGCTGGCG 16860

CGTGCTACGC CGGACCCGGC ACTCACCCGT GGACCGTTGC AGGTCCTGAC CGGCCTTCTC 16920

CGCGCAGGGT CAGACGGAGA CCGCGCCACT CACCACATGG CGCTCGAGGC TCCGGGAACC 16980

GTGCGTGGAG AAAGCCTAGA CCCGCCTGTT TGACAGAAGG GGCCAGCGCG CACACGCCAC 17040

AGGCCACCCC CCGTGCGACT GAGCTTCAAC CCCGTCAATG CCGATGTACC CGCTACCTGG 17100

CGAGACGCCA CTAACGTGTA CTCGGGTGCT CCCTACTATG TGTGTGTTTA CGAACGCGGT 17160

GGCCGTCAGG AAGACGACTG GCTGCCGATA CCACTGAGCT TCCCAGAAGA GCCCGTGCCC 17220

CCGCCACCGG GCTTAGTGTT CATGGACGAC TTGTTCATTA ACACGAAGCA GTGCGACTTT 17280

GTGGACACGC TAGAGGCCGC CTGTCGCACG CAAGGCTACA CGTTGAGACA GCGCGTGCCT 17340

GTCGCCATTC CTCGCGACGC GGAAATCGCA GACGCAGTTA AATCGCACTT TTTAGAGGCG 17400

TGCCTAGTGT TACGGGGGCT GGCTTCGGAG GCTAGTGCCT GGATAAGAGC TGCCACGTCC 17460

CCGCCCCTTG GCCGCCACGC CTGCTGGATG GACGTGTTAG GATTATGGGA AAGCCGCCCC 17520

CACACTCTAG GTTTGGAGTT ACGCGGCGTA AACTGTGGCG GCACGGACGG TGACTGGTTA 17580

GAGATTTTAA AACAGCCCGA TGTGCAAAAG ACAGTCAGCG GGAGTCTTGT GGCATGCGTG 17640

ATCGTCACAC CCGCATTGGA AGCCTGGCTT GTGTTACCTG GGGGTTTTGC TATTAAAGCC 17700

CGCTATAGGG CGTCGAAGGA GGATCTGGTG TTCATTCGAG GCCGCTATGG CTAGCCGGAG 17760

GCGCAAACTT CGGAATTTCC TAAACAAGGA ATGCATATGG ACTGTTAACC CAATGTCAGG 17820

GGACCATATC AAGGTCTTTA ACGCCTGCAC CTCTATCTCG CCGGTGTATG ACCCTGAGCT 17880

GGTAACCAGC TACGCACTGA GCGTGCCTGC TTACAATGTG TCTGTGGCTA TCTTGCTGCA 17940

TAAAGTCATG GGACCGTGTG TGGCTGTGGG AATTAACGGA GAAATGATCA TGTACGTCGT 18000

AAGCCAGTGT GTTTCTGTGC GGCCCGTCCC GGGGCGCGAT GGTATGGCGC TCATCTACTT 18060

TGGACAGTTT CTGGAGGAAG CATCCGGACT GAGATTTCCC TACATTGCTC CGCCGCCGTC 18120

GCGCGAACAC GTACCTGACC TGACCAGACA AGAATTAGTT CATACCTCCC AGGTGGTGCG 18180

CCGCGGCGAC CTGACCAATT GCACTATGGG TCTCGAATTC AGGAATGTGA ACCCTTTTGT 18240

TTGGCTCGGG GGCGGATCGG TGTGGCTGCT GTTCTTGGGC GTGGACTACA TGGCGTTCTG 18300

TCCGGGTGTC GACGGAATGC CGTCGTTGGC AAGAGTGGCC GCCCTGCTTA CCAGGTGCGA 18360

CCACCCAGAC TGTGTCCACT GCCATGGACT CCGTGGACAC GTTAATGTAT TTCGTGGGTA 18420

CTGTTCTGCG CAGTCGCCGG GTCTATCTAA CATCTGTCCC TGTATCAAAT CATGTGGGAC 18480

CGGGAATGGA GTGACTAGGG TCACTGGAAA CAGAAATTTT CTGGGTCTTC TGTTCGATCC 18540

CATTGTCCAG AGCAGGGTAA CAGCTCTGAA GATAACTAGC CACCCAACCC CCACGCACGT 18600
CGAGAATGTG CTAACAGGAG TGCTCGACGA CGGCACCTTG GTGCCGTCCG TCCAAGGCAC 18660
CCTGGGTCCT CTTACGAATG TCTGACTACT TCAGCCGCTT GCTGATATAT GAGTGTAAAA 18720
AACTTAAGGC CCTGGGCTTA CGTTCTTATT GAAGCATGTT GCGCACATCA GCGAGCTGGA 18780
CCGTCCTCCG GGTCGCGTGT AGATTATGGT TCCGTTCTCC TTCTTGATGT TTAAATTTTT 18840
GGGGGGGAAC CACCGACAAA GCGTCTTTAT GATTTCCGCG AACACGGAGT TGGCTACGTG 18900
CTTTTGGTGG GCTACGTACC CAATGTTAAT GTTCTCTACG GATGCCAGTA GCATGCTGAT 18960
GATCGCCACC ACTATCCATG TCTTTCCGTG TCTCCTTGGT ATTAGGAATA CGCTTGCCTT 19020
TTGCTTAAAC GTCTGTAAAA CACTGTTTGG AGTTTCAAAT AAACCGAAGT ACTGCTTAAA 19080
CAATCCAAAC AACTGGTGCG TCTTTTGTGG GGCCTTGATT GAAACCAAAA AGAAAAAAGT 19140
GTGCATTACT AGCTGCTGTT GGAAGGGCTC CAGCCAGTGC ACCCCGGGAA CGTAACAGCC 19200
GTTCAGAAAG GACGAAAGGT TAACCAGAAA AGCCTGAAGT TCGCGGTAGA CAGAGCAGGC 19260
GTGCAGGGAG TCGTGTGTTT TTCTGCCCGC CTGGTACTCG ACCAGTTGAT CGGCCGTGG^ 19320
GACGTGCGCG TCCTCGCGCA CACACCGCAT CTGCAAGTAT GTTGATAGGG ACTCCAATAG 19380
GCGCGGCTTT GCGGGGACGT TGTCCTCGGA CGGTCTGGGG GTTCCCACGT CGGGATTTGC 19440
TGACGTGGGC GTGGCGGGAT GGTGCCGTGT GCAGTATGTT TCCAGGACCG AACTGTATGA 19500
GTTTATTCTG TGCACCACGC CAATAAAAGG GTGCGCCATC CGTGCCGTTT TGGGACAGTG 19560
TCGCGTGAAT GTCGGGGCAC TCAGTTCCCA CCTCTCTCCG GCGTCTTTGG CGGTCTCCTC 19620
CAGGTTGGCG GCAAGGCGCT CCCTGTGACG GCTGAGCAGC ATGTTTGCTT TGAGCTCGCT 19680
CGTGTCCGAG GGTGACCCGG AGGTGACCAG TAGGTACGTC AAGGGCGTAC AACTTGCCCT 19740
GGACCTTAGC GAGAACACAC CTGGACAATT TAAGTTGATA GAAACTCCCC TGAACAGCTT 19800
CCTCTTGGTT TCCAACGTGA TGCCCGAGGT CCAGCCAATC TGCAGTGGCC GGCCGGCCTT 19860
GCGGCCAGAC TTTAGTAATC TCCACTTGCC TAGACTGGAG AAGCTCCAGA GAGTCCTCGG 19920
GCAGGGTTTC GGGGCGGCGG GTGAGGAAAT CGCACTGGAC CCGTCTCACG TAGAAACACA 19980
CGAAAAGGGC CAGGTGTTCT ACAACCACTA TGCTACCGAG GAGTGGACGT GGGCTTTGAC 20040
TCTGAATAAG GATGCGCTCC TTCGGGAGGC TGTAGATGGC CTGTGTGACC CCGGAACTTG 20100
GAAGGGTCTT CTTCCTGACG ACCCCCTTCC GTTGCTATGG CTGCTGTTCA ACGGACCCGC 20160
CTCTTTTTGT CGGGCCGACT GTTGCCTGTA CAAGCAGCAC TGCGGTTACC CGGGCCCGGT 20220
GCTACTTCCA GGTCACATGT ACGCTCCCAA ACGGGATCTT TTGTCGTTCG TTAATCATGC 20280
CCTGAAGTAC ACCAAGTTTC TATACGGAGA TTTTTCCGGG ACATGGGCGG CGGCTTGCCG 20340
CCCGCCATTC GCTACTTCTC GGATACAAAG GGTAGTGAGT CAGATGAAAA TCATAGATGC 20400
TTCCGACACT TACATTTCCC ACACCTGCCT CTTGTGTCAC ATATATCAGC AAAATAGCAT 20460
AATTGCGGGT CAGGGGACCC ACGTGGGTGG AATCCTACTG TTGAGTGGAA AAGGGACCCA 20520
GTATATAACA GGCAATGTTC AGACCCAAAG GTGTCCAACT ACGGGCGACT ATCTAATCAT 20580
CCCATCGTAT GACATACCGG CGATCATCAC CATGATCAAG GAGAATGGAC TCAACCAACT 20640
CTAAAAGAGA GTTTATTAAG TCGGCTCTGG AGGCCAACAT CAACAGGAGG GCAGCTGTAT 20700



CGCTATTTGA

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4131 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..4131
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

ATG GAG GCG ACC TTG GAG CAA CGA CCT TTC CCG TAC CTC GCC ACG GAG 48
Met Glu Ala Thr Leu Glu Gln Arg Pro Phe Pro Tyr Leu Ala Thr Glu
1 5 10 15

GCC AAC CTC CTA ACG CAG ATT AAG GAG TCG GCT GCC GAC GGA CTC TTC 96
Ala Asn Leu Leu Thr Gln Ile Lys Glu Ser Ala Ala Asp Gly Leu Phe
20 25 30

AAG AGC TTT CAG CTA TTG CTC GGC AAG GAC GCC AGA GAA GGC AGT GTC 144
Lys Ser Phe Gln Leu Leu Leu Gly Lys Asp Ala Arg Glu Gly Ser Val
35 40 45

CGT TTC GAA GCG CTA CTG GGC GTA TAT ACC AAT GTG GTG GAG TTT GTT 192
Arg Phe Glu Ala Leu Leu Gly Val Tyr Thr Asn Val Val Glu Phe Val
50 55 60

AAG TTT CTG GAG ACC GCC CTC GCC GCC GCT TGC GTC AAT ACC GAG TTC 240
Lys Phe Leu Glu Thr Ala Leu Ala Ala Ala Cys Val Asn Thr Glu Phe
65 70 75 80



AAG GAC CTG CGG AGA ATG ATA GAT GGA AAA ATA CAG TTT AAA ATT TCA 288
Lys Asp Leu Arg Arg Met Ile Asp Gly Lys Ile Gln Phe Lys Ile Ser
85 90 95

ATG CCC ACT ATT GCC CAC GGA GAC GGG AGG AGG CCC AAC AAG CAG AGA 336
Met Pro Thr Ile Ala His Gly Asp Gly Arg Arg Pro Asn Lys Gln Arg
100 105 110

CAG TAT ATC GTC ATG AAG GCT TGC AAT AAG CAC CAC ATC GGT GCG GAG 384
Gln Tyr Ile Val Met Lys Ala Cys Asn Lys His His Ile Gly Ala Glu
115 120 125

ATT GAG CTT GCG GCC GCA GAC ATC GAG CTT CTC TTC GCC GAG AAA GAG 432
Ile Glu Leu Ala Ala Ala Asp Ile Glu Leu Leu Phe Ala Glu Lys Glu
130 135 140

ACG CCC TTG GAC TTC ACA GAG TAC GCG GGT GCC ATC AAG ACG ATT ACG 480
Thr Pro Leu Asp Phe Thr Glu Tyr Ala Gly Ala Ile Lys Thr Ile Thr
145 150 155 160
TCG GCT TTG CAG TTT GGT ATG GAC GCC CTA GAA CGG GGG CTA GTG GAC 528
Ser Ala Leu Gln Phe Gly Met Asp Ala Leu Glu Arg Gly Leu Val Asp
165 170 175

ACG GTT CTC GCA GTT AAA CTT CGG CAC GCT CCA CCC GTC TTT ATT TTA 576
Thr Val Leu Ala Val Lys Leu Arg His Ala Pro Pro Val Phe Ile Leu
180 185 190

AAG ACG CTG GGC GAT CCC GTC TAC TCT GAG AGG GGC CTC AAA AAG GCC 624
Lys Thr Leu Gly Asp Pro Val Tyr Ser Glu Arg Gly Leu Lys Lys Ala
195 200 205

GTC AAG TCT GAC ATG GTA TCC ATG TTC AAG GCA CAC CTC ATA GAA CAT 672
Val Lys Ser Asp Met Val Ser Met Phe Lys Ala His Leu Ile Glu His
210 215 220

TCA TTT TTT CTA GAT AAG GCC GAG CTC ATG ACA AGG GGG AAG CAG TAT 720
Ser Phe Phe Leu Asp Lys Ala Glu Leu Met Thr Arg Gly Lys Gln Tyr
225 230 235 240

GTC CTA ACC ATG CTC TCC GAC ATG CTG GCC GCG GTG TGC GAG GAT ACC 768
Val Leu Thr Met Leu Ser Asp Met Leu Ala Ala Val Cys Glu Asp Thr
245 250 255

GTC TTT AAG GGT GTC AGC ACG TAC ACC ACG GCC TCT GGG CAG CAG GTG 816
Val Phe Lys Gly Val Ser Thr Tyr Thr Thr Ala Ser Gly Gln Gln Val
260 265 270

GCC GGC GTC CTG GAG ACG ACG GAC AGC GTC ATG AGA CGG CTG ATG AAC 864
Ala Gly Val Leu Glu Thr Thr Asp Ser Val Met Arg Arg Leu Met Asn
275 280 285

CTG CTG GGG CAA GTG GAA AGT GCC ATG TCC GGG CCC GCG GCC TAC GCC 912
Leu Leu Gly Gln Val Glu Ser Ala Met Ser Gly Pro Ala Ala Tyr Ala
290 295 300

AGC TAC GTT GTC AGG GGT GCC AAC CTC GTC ACC GCC GTT AGC TAC GGA 960
Ser Tyr Val Val Arg Gly Ala Asn Leu Val Thr Ala Val Ser Tyr Gly
305 310 315 320

AGG GCG ATG AGA AAC TTT GAA CAG TTT ATG GCA CGC ATA GTG GAC CAT 1008
Arg Ala Met Arg Asn Phe Glu Gln Phe Met Ala Arg Ile Val Asp His
325 330 335

CCC AAC GCT CTG CCG TCT GTG GAA GGT GAC AAG GCC GCT CTG GCG GAC 1056
Pro Asn Ala Leu Pro Ser Val Glu Gly Asp Lys Ala Ala Leu Ala Asp
340 345 350

GGA CAC GAC GAG ATT CAG AGA ACC CGC ATC GCC GCC TCT CTC GTC AAG 1104
Gly His Asp Glu Ile Gln Arg Thr Arg Ile Ala Ala Ser Leu Val Lys
355 360 365

ATA GGG GAT AAG TTT GTG GCC ATT GAA AGT TTG CAG CGC ATG TAC AAC 1152
Ile Gly Asp Lys Phe Val Ala Ile Glu Ser Leu Gln Arg Met Tyr Asn
370 375 380

GAG ACT CAG TTT CCC TGC CCA CTG AAC CGG CGC ATC CAG TAC ACC TAT 1200
Glu Thr Gln Phe Pro Cys Pro Leu Asn Arg Arg Ile Gln Tyr Thr Tyr
385 390 395 400

TTC TTC CCT GTT GGC CTT CAC CTT CCC GTG CCC CGC TAC TCG ACA TCC 1248
Phe Phe Pro Val Gly Leu His Leu Pro Val Pro Arg Tyr Ser Thr Ser
405 410 415

GTC TCA GTC AGG GGC GTA GAA TCC CCG GCC ATC CAG TCG ACC GAG ACG 1296
Val Ser Val Arg Gly Val Glu Ser Pro Ala Ile Gln Ser Thr Glu Thr
420 425 430
TGG GTG GTT AAT AAA AAC AAC GTG CCT CTT TGC TTC GGT TAC CAA AAC 1344
Trp Val Val Asn Lys Asn Asn Val Pro Leu Cys Phe Gly Tyr Gln Asn
435 440 445

GCC CTC AAA AGC ATA TGC CAC CCT CGA ATG CAC AAC CCC ACC CAG TCA 1392
Ala Leu Lys Ser Ile Cys His Pro Arg Met His Asn Pro Thr Gln Ser
450 455 460

GCC CAG GCA CTA AAC CAA GCT TTT CCC GAT CCC GAC GGG GGA CAT GGG 1440
Ala Gln Ala Leu Asn Gln Ala Phe Pro Asp Pro Asp Gly Gly His Gly
465 470 475 480

TAC GGT CTC AGG TAT GAG CAG ACG CCA AAC ATG AAC CTA TTC AGA ACG 1488
Tyr Gly Leu Arg Tyr Glu Gln Thr Pro Asn Met Asn Leu Phe Arg Thr
485 490 495

TTC CAC CAG TAT TAC ATG GGG AAA AAC GTG GCA TTT GTT CCC GAT GTG 1536
Phe His Gln Tyr Tyr Met Gly Lys Asn Val Ala Phe Val Pro Asp Val
500 505 510

GCC CAA AAA GCG CTC GTA ACC ACG GAG GAT CTA CTG CAC CCA ACC TCT 1584
Ala Gln Lys Ala Leu Val Thr Thr Glu Asp Leu Leu His Pro Thr Ser
515 520 525

CAC CGT CTC CTC AGA TTG GAG GTC CAC CCC TTC TTT GAT TTT TTT GTG 1632
His Arg Leu Leu Arg Leu Glu Val His Pro Phe Phe Asp Phe Phe Val
530 535 540

CAC CCC TGT CCT GGA GCG AGA GGA TCG TAC CGC GCC ACC CAC AGA ACA 1680
His Pro Cys Pro Gly Ala Arg Gly Ser Tyr Arg Ala Thr His Arg Thr
545 550 555 560

ATG GTT GGA AAT ATA CCA CAA CCG CTC GCT CCA AGG GAG TTT CAG GAA 1728
Met Val Gly Asn Ile Pro Gln Pro Leu Ala Pro Arg Glu Phe Gln Glu
565 570 575

AGT AGA GGG GCG CAG TTC GAC GCT GTG ACG AAT ATG ACA CAC GTC ATA 1776
Ser Arg Gly Ala Gln Phe Asp Ala Val Thr Asn Met Thr His Val Ile
580 585 590

GAC CAG CTA ACT ATT GAC GTC ATA CAG GAG ACG GCA TTT GAC CCC GCG 1824
Asp Gln Leu Thr Ile Asp Val Ile Gln Glu Thr Ala Phe Asp Pro Ala
595 600 605

TAT CCC CTG TTC TGC TAT GTA ATC GAA GCA ATG ATT CAC GGA CAG GAA 1872
Tyr Pro Leu Phe Cys Tyr Val Ile Glu Ala Met Ile His Gly Gln Glu
610 615 620

GAA AAA TTC GTG ATG AAC ATG CCC CTC ATT GCC CTG GTC ATT CAA ACC 1920
Glu Lys Phe Val Met Asn Met Pro Leu Ile Ala Leu Val Ile Gln Thr
625 630 635 640

TAC TGG GTC AAC TCG GGA AAA CTG GCG TTT GTG AAC AGT TAT CAC ATG 1968
Tyr Trp Val Asn Ser Gly Lys Leu Ala Phe Val Asn Ser Tyr His Met
645 650 655

GTT AGA TTC ATC TGT ACG CAT ATT GGG AAT GGA AGC ATC CCT AAG GAG 2016
Val Arg Phe Ile Cys Thr His Ile Gly Asn Gly Ser Ile Pro Lys Glu
660 665 670

GCG CAC GGC CAC TAC CGG AAA ATC TTA GGC GAG CTC ATC GCC CTT GAG 2064
Ala His Gly His Tyr Arg Lys Ile Leu Gly Glu Leu Ile Ala Leu Glu
675 680 685

CAG GCG CTT CTC AAG CTC GCG GGA CAC GAG ACG GTG GGT CGG ACG CCG 2112
Gln Ala Leu Leu Lys Leu Ala Gly His Glu Thr Val Gly Arg Thr Pro
690 695 700
ATC ACA CAT CTG GTT TCG GCT CTC CTC GAC CCG CAT CTG CTG CCT CCC 2160
Ile Thr His Leu Val Ser Ala Leu Leu Asp Pro His Leu Leu Pro Pro
705 710 715 720

TTT GCC TAC CAC GAT GTC TTT ACG GAT CTT ATG CAG AAG TCA TCC AGA 2208
Phe Ala Tyr His Asp Val Phe Thr Asp Leu Met Gln Lys Ser Ser Arg
725 730 735

CAA CCC ATA ATC AAG ATC GGG GAT CAA AAC TAC GAC AAC CCT CAA AAT 2256
Gln Pro Ile Ile Lys Ile Gly Asp Gln Asn Tyr Asp Asn Pro Gln Asn
740 745 750

AGG GCG ACA TTC ATC AAC CTC AGG GGT CGC ATG GAG GAC CTA GTC AAT 2304
Arg Ala Thr Phe Ile Asn Leu Arg Gly Arg Met Glu Asp Leu Val Asn
755 760 765

AAC CTT GTT AAC ATT TAC CAG ACA AGG GTC AAT GAG GAC CAT GAC GAG 2352
Asn Leu Val Asn Ile Tyr Gln Thr Arg Val Asn Glu Asp His Asp Glu
770 775 780

AGA CAC GTC CTG GAC GTG GCG CCC CTG GAC GAG AAT GAC TAC AAC CCG 2400
Arg His Val Leu Asp Val Ala Pro Leu Asp Glu Asn Asp Tyr Asn Pro
785 790 795 800

GTC CTC GAG AAG CTA TTC TAC TAT GTT TTA ATG CCG GTG TGC AGT AAC 2448
Val Leu Glu Lys Leu Phe Tyr Tyr Val Leu Met Pro Val Cys Ser Asn
805 810 815

GGC CAC ATG TGC GGT ATG GGG GTC GAC TAT CAA AAC GTG GCC CTG ACG 2496
Gly His Met Cys Gly Met Gly Val Asp Tyr Gln Asn Val Ala Leu Thr
820 825 830

CTG ACT TAC AAC GGC CCC GTC TTT GCG GAC GTC GTG AAC GCA CAG GAT 2544
Leu Thr Tyr Asn Gly Pro Val Phe Ala Asp Val Val Asn Ala Gln Asp
835 840 845

GAT ATT CTA CTG CAC CTG GAG AAC GGA ACC TTG AAG GAC ATT CTG CAG 2592
Asp Ile Leu Leu His Leu Glu Asn Gly Thr Leu Lys Asp Ile Leu Gln
850 855 860

GCA GGC GAC ATA CGC CCG ACG GTG GAC ATG ATC AGG GTG CTG TGC ACC 2640
Ala Gly Asp Ile Arg Pro Thr Val Asp Met Ile Arg Val Leu Cys Thr
865 870 875 880

TCG TTT CTG ACG TGC CCT TTC GTC ACC CAG GCC GCT CGC GTG ATC ACA 2688
Ser Phe Leu Thr Cys Pro Phe Val Thr Gln Ala Ala Arg Val Ile Thr
885 890 895

AAG CGG GAC CCG GCC CAG AGT TTT GCC ACG CAC GAA TAC GGG AAG GAT 2736
Lys Arg Asp Pro Ala Gln Ser Phe Ala Thr His Glu Tyr Gly Lys Asp
900 905 910

GTG GCG CAG ACC GTG CTT GTT AAT GGC TTT GGT GCG TTC GCG GTG GCG 2784
Val Ala Gln Thr Val Leu Val Asn Gly Phe Gly Ala Phe Ala Val Ala
915 920 925

GAC CGC TCT CGC GAG GCG GCG GAG ACT ATG TTT TAT CCG GTA CCC TTT 2832
Asp Arg Ser Arg Glu Ala Ala Glu Thr Met Phe Tyr Pro Val Pro Phe
930 935 940

AAC AAG CTC TAC GCT GAC CCG TTG GTG GCT GCC ACA CTG CAT CCG CTC 2880
Asn Lys Leu Tyr Ala Asp Pro Leu Val Ala Ala Thr Leu His Pro Leu
945 950 955 960

CTG CCA AAC TAT GTC ACC AGG CTC CCC AAC CAG AGA AAC GCG GTG GTC 2928
Leu Pro Asn Tyr Val Thr Arg Leu Pro Asn Gln Arg Asn Ala Val Val
965 970 975
TTT AAC GTG CCA TCC AAT CTC ATG GCA GAA TAT GAG GAA TGG CAC AAG 2976
Phe Asn Val Pro Ser Asn Leu Met Ala Glu Tyr Glu Glu Trp His Lys
980 985 990

TCG CCC GTC GCG GCG TAT GCC GCG TCT TGT CAG GCC ACC CCG GGC GCC 3024
Ser Pro Val Ala Ala Tyr Ala Ala Ser Cys Gln Ala Thr Pro Gly Ala
995 1000 1005

ATT AGC GCC ATG GTG AGC ATG CAC CAA AAA CTA TCT GCC CCC AGT TTC 3072
Ile Ser Ala Met Val Ser Met His Gln Lys Leu Ser Ala Pro Ser Phe
1010 1015 1020

ATT TGC CAG GCA AAA CAC CGC ATG CAC CCT GGT TTT GCC ATG ACA GTC 3120
Ile Cys Gln Ala Lys His Arg Met His Pro Gly Phe Ala Met Thr Val
1025 1030 1035 1040

GTC AGG ACG GAC GAG GTT CTA GCA GAG CAC ATC CTA TAC TGC TCC AGG 3168
Val Arg Thr Asp Glu Val Leu Ala Glu His Ile Leu Tyr Cys Ser Arg
1045 1050 1055

GCG TCG ACA TCC ATG TTT GTG GGC TTG CCT TCG GTG GTA CGG CGC GAG 3216
Ala Ser Thr Ser Met Phe Val Gly Leu Pro Ser Val Val Arg Arg Glu
1060 1065 1070

GTA CGT TCG GAC GCG GTG ACT TTT GAA ATT ACC CAC GAG ATC GCT TCC 3264
Val Arg Ser Asp Ala Val Thr Phe Glu Ile Thr His Glu Ile Ala Ser
1075 1080 1085

CTG CAC ACC GCA CTT GGC TAC TCA TCA GTC ATC GCC CCG GCC CAC GTG 3312
Leu His Thr Ala Leu Gly Tyr Ser Ser Val Ile Ala Pro Ala His Val
1090 1095 1100

GCC GCC ATA ACT ACA GAC ATG GGA GTA CAT TGT CAG GAC CTC TTT ATG 3360
Ala Ala Ile Thr Thr Asp Met Gly Val His Cys Gln Asp Leu Phe Met
1105 1110 1115 1120

ATT TTC CCA GGG GAC GCG TAT CAG GAC CGC CAG CTG CAT GAC TAT ATC 3408
Ile Phe Pro Gly Asp Ala Tyr Gln Asp Arg Gln Leu His Asp Tyr Ile
1125 1130 1135

AAA ATG AAA GCG GGC GTG CAA ACC GGC TCA CCG GGA AAC AGA ATG GAT 3456
Lys Met Lys Ala Gly Val Gln Thr Gly Ser Pro Gly Asn Arg Met Asp
1140 1145 1150

CAC GTG GGA TAC ACT GCT GGG GTT CCT CGC TGC GAG AAC CTG CCC GGT 3504
His Val Gly Tyr Thr Ala Gly Val Pro Arg Cys Glu Asn Leu Pro Gly
1155 1160 1165

TTG AGT CAT GGT CAG CTG GCA ACC TGC GAG ATA ATT CCC ACG CCG GTC 3552
Leu Ser His Gly Gln Leu Ala Thr Cys Glu Ile Ile Pro Thr Pro Val
1170 1175 1180

ACA TCT GAC GTT GCC TAT TTC CAG ACC CCC AGC AAC CCC CGG GGG CGT 3600
Thr Ser Asp Val Ala Tyr Phe Gln Thr Pro Ser Asn Pro Arg Gly Arg
1185 1190 1195 1200

GCG GCG TCG GTC GTG TCG TGT GAT GCT TAC AGT AAC GAA AGC GCA GAG 3648
Ala Ala Ser Val Val Ser Cys Asp Ala Tyr Ser Asn Glu Ser Ala Glu
1205 1210 1215

CGT TTG TTC TAC GAC CAT TCA ATA CCA GAC CCC GCG TAC GAA TGC CGG 3696
Arg Leu Phe Tyr Asp His Ser Ile Pro Asp Pro Ala Tyr Glu Cys Arg
1220 1225 1230

TCC ACC AAC AAC CCG TGG GCT TCG CAG CGT GGC TCC CTC GGC GAC GTG 3744
Ser Thr Asn Asn Pro Trp Ala Ser Gln Arg Gly Ser Leu Gly Asp Val
1235 1240 1245
CTA TAC AAT ATC ACC TTT CGC CAG ACT GCG CTG CCG GGC ATG TAC AGT 3792
Leu Tyr Asn Ile Thr Phe Arg Gln Thr Ala Leu Pro Gly Met Tyr Ser
1250 1255 1260

CCT TGT CGG CAG TTC TTC CAC AAG GAA GAC ATT ATG CGG TAC AAT AGG 3840
Pro Cys Arg Gln Phe Phe His Lys Glu Asp Ile Met Arg Tyr Asn Arg
1265 1270 1275 1280

GGG TTG TAC ACT TTG GTT AAT GAG TAT TCT GCC AGG CTT GCT GGG GCC 3888
Gly Leu Tyr Thr Leu Val Asn Glu Tyr Ser Ala Arg Leu Ala Gly Ala
1285 1290 1295

CCC GCC ACC AGC ACT ACA GAC CTC CAG TAC GTC GTG GTC AAC GGT ACA 3936
Pro Ala Thr Ser Thr Thr Asp Leu Gln Tyr Val Val Val Asn Gly Thr
1300 1305 1310

GAC GTG TTT TTG GAC CAG CCT TGC CAT ATG CTG CAG GAG GCC TAT CCC 3984
Asp Val Phe Leu Asp Gln Pro Cys His Met Leu Gln Glu Ala Tyr Pro
1315 1320 1325

ACG CTC GCC GCC AGC CAC AGA GTT ATG CTT GCC GAG TAC ATG TCA AAC 4032
Thr Leu Ala Ala Ser His Arg Val Met Leu Ala Glu Tyr Met Ser Asn
1330 1335 1340

AAG CAG ACA CAC GCC CCA GTA CAC ATG GGC CAG TAT CTC ATT GAA GAG 4080
Lys Gln Thr His Ala Pro Val His Met Gly Gln Tyr Leu Ile Glu Glu
1345 1350 1355 1360

GTG GCG CCG ATG AAG AGA CTA TTA AAG CTC GGA AAC AAG GTG GTG TAT 4128
Val Ala Pro Met Lys Arg Leu Leu Lys Leu Gly Asn Lys Val Val Tyr
1365 1370 1375



TAG 



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1376 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Glu Ala Thr Leu Glu Gln Arg Pro Phe Pro Tyr Leu Ala Thr Glu
1 5 10 15

Ala Asn Leu Leu Thr Gln Ile Lys Glu Ser Ala Ala Asp Gly Leu Phe
20 25 30

Lys Ser Phe Gln Leu Leu Leu Gly Lys Asp Ala Arg Glu Gly Ser Val
35 40 45

Arg Phe Glu Ala Leu Leu Gly Val Tyr Thr Asn Val Val Glu Phe Val
50 55 60

Lys Phe Leu Glu Thr Ala Leu Ala Ala Ala Cys Val Asn Thr Glu Phe
65 70 75 80

Lys Asp Leu Arg Arg Met Ile Asp Gly Lys Ile Gln Phe Lys Ile Ser
85 90 95

Met Pro Thr Ile Ala His Gly Asp Gly Arg Arg Pro Asn Lys Gln Arg
100 105 110
Gln Tyr Ile Val Met Lys Ala Cys Asn Lys His His Ile Gly Ala Glu
115 120 125

Ile Glu Leu Ala Ala Ala Asp Ile Glu Leu Leu Phe Ala Glu Lys Glu
130 135 140

Thr Pro Leu Asp Phe Thr Glu Tyr Ala Gly Ala Ile Lys Thr Ile Thr
145 150 155 160

Ser Ala Leu Gln Phe Gly Met Asp Ala Leu Glu Arg Gly Leu Val Asp
165 170 175

Thr Val Leu Ala Val Lys Leu Arg His Ala Pro Pro Val Phe Ile Leu
180 185 190

Lys Thr Leu Gly Asp Pro Val Tyr Ser Glu Arg Gly Leu Lys Lys Ala
195 200 205

Val Lys Ser Asp Met Val Ser Met Phe Lys Ala His Leu Ile Glu His
210 215 220

Ser Phe Phe Leu Asp Lys Ala Glu Leu Met Thr Arg Gly Lys Gln Tyr
225 230 235 240

Val Leu Thr Met Leu Ser Asp Met Leu Ala Ala Val Cys Glu Asp Thr
245 250 255

Val Phe Lys Gly Val Ser Thr Tyr Thr Thr Ala Ser Gly Gln Gln Val
260 265 270

Ala Gly Val Leu Glu Thr Thr Asp Ser Val Met Arg Arg Leu Met Asn
275 280 285

Leu Leu Gly Gln Val Glu Ser Ala Met Ser Gly Pro Ala Ala Tyr Ala
290 295 300

Ser Tyr Val Val Arg Gly Ala Asn Leu Val Thr Ala Val Ser Tyr Gly
305 310 315 320

Arg Ala Met Arg Asn Phe Glu Gln Phe Met Ala Arg Ile Val Asp His
325 330 335

Pro Asn Ala Leu Pro Ser Val Glu Gly Asp Lys Ala Ala Leu Ala Asp
340 345 350

Gly His Asp Glu Ile Gln Arg Thr Arg Ile Ala Ala Ser Leu Val Lys
355 360 365

Ile Gly Asp Lys Phe Val Ala Ile Glu Ser Leu Gln Arg Met Tyr Asn
370 375 380

Glu Thr Gln Phe Pro Cys Pro Leu Asn Arg Arg Ile Gln Tyr Thr Tyr
385 390 395 400

Phe Phe Pro Val Gly Leu His Leu Pro Val Pro Arg Tyr Ser Thr Ser
405 410 415

Val Ser Val Arg Gly Val Glu Ser Pro Ala Ile Gln Ser Thr Glu Thr
420 425 430

Trp Val Val Asn Lys Asn Asn Val Pro Leu Cys Phe Gly Tyr Gln Asn
435 440 445

Ala Leu Lys Ser Ile Cys His Pro Arg Met His Asn Pro Thr Gln Ser
450 455 460

Ala Gln Ala Leu Asn Gln Ala Phe Pro Asp Pro Asp Gly Gly His Gly
465 470 475 480

WO 96/15779

PCT/US95/15138

Tyr Gly Leu Arg Tyr Glu Gln Thr Pro Asn Met Asn Leu Phe Arg Thr
485 490 495

Phe His Gln Tyr Tyr Met Gly Lys Asn Val Ala Phe Val Pro Asp Val
500 505 510

Ala Gln Lys Ala Leu Val Thr Thr Glu Asp Leu Leu His Pro Thr Ser
515 520 525

His Arg Leu Leu Arg Leu Glu Val His Pro Phe Phe Asp Phe Phe Val
530 535 540

His Pro Cys Pro Gly Ala Arg Gly Ser Tyr Arg Ala Thr His Arg Thr
545 550 555 560

Met Val Gly Asn Ile Pro Gln Pro Leu Ala Pro Arg Glu Phe Gln Glu
565 570 575

Ser Arg Gly Ala Gln Phe Asp Ala Val Thr Asn Met Thr His Val Ile
580 585 590

Asp Gln Leu Thr Ile Asp Val Ile Gln Glu Thr Ala Phe Asp Pro Ala
595 600 605

Tyr Pro Leu Phe Cys Tyr Val Ile Glu Ala Met Ile His Gly Gln Glu
610 615 620

Glu Lys Phe Val Met Asn Met Pro Leu Ile Ala Leu Val Ile Gln Thr
625 630 635 640

Tyr Trp Val Asn Ser Gly Lys Leu Ala Phe Val Asn Ser Tyr His Met
645 650 655

Val Arg Phe Ile Cys Thr His Ile Gly Asn Gly Ser Ile Pro Lys Glu
660 665 670

Ala His Gly His Tyr Arg Lys Ile Leu Gly Glu Leu Ile Ala Leu Glu
675 680 685

Gln Ala Leu Leu Lys Leu Ala Gly His Glu Thr Val Gly Arg Thr Pro
690 695 700

Ile Thr His Leu Val Ser Ala Leu Leu Asp Pro His Leu Leu Pro Pro
705 710 715 720

Phe Ala Tyr His Asp Val Phe Thr Asp Leu Met Gln Lys Ser Ser Arg
725 730 735

Gln Pro Ile Ile Lys Ile Gly Asp Gln Asn Tyr Asp Asn Pro Gln Asn
740 745 750

Arg Ala Thr Phe Ile Asn Leu Arg Gly Arg Met Glu Asp Leu Val Asn
755 760 765

Asn Leu Val Asn Ile Tyr Gln Thr Arg Val Asn Glu Asp His Asp Glu
770 775 780

Arg His Val Leu Asp Val Ala Pro Leu Asp Glu Asn Asp Tyr Asn Pro
785 790 795 800

Val Leu Glu Lys Leu Phe Tyr Tyr Val Leu Met Pro Val Cys Ser Asn
805 810 815

Gly His Met Cys Gly Met Gly Val Asp Tyr Gln Asn Val Ala Leu Thr
820 825 830
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Leu Thr Tyr Asn Gly Pro Val Phe Ala Asp Val Val Asn Ala Gin Asp 
835 840 845 

Asp Ile Leu Leu His Leu Glu Asn Gly Thr Leu Lys Asp Ile Leu Gln
850 855 860

Ala Gly Asp Ile Arg Pro Thr Val Asp Met Ile Arg Val Leu Cys Thr
865 870 875 880

Ser Phe Leu Thr Cys Pro Phe Val Thr Gln Ala Ala Arg Val Ile Thr
885 890 895

Lys Arg Asp Pro Ala Gln Ser Phe Ala Thr His Glu Tyr Gly Lys Asp
900 905 910

Val Ala Gln Thr Val Leu Val Asn Gly Phe Gly Ala Phe Ala Val Ala
915 920 925

Asp Arg Ser Arg Glu Ala Ala Glu Thr Met Phe Tyr Pro Val Pro Phe
930 935 940

Asn Lys Leu Tyr Ala Asp Pro Leu Val Ala Ala Thr Leu His Pro Leu
945 950 955 960

Leu Pro Asn Tyr Val Thr Arg Leu Pro Asn Gln Arg Asn Ala Val Val
965 970 975

Phe Asn Val Pro Ser Asn Leu Met Ala Glu Tyr Glu Glu Trp His Lys
980 985 990

Ser Pro Val Ala Ala Tyr Ala Ala Ser Cys Gln Ala Thr Pro Gly Ala
995 1000 1005

Ile Ser Ala Met Val Ser Met His Gln Lys Leu Ser Ala Pro Ser Phe
1010 1015 1020

Ile Cys Gln Ala Lys His Arg Met His Pro Gly Phe Ala Met Thr Val
1025 1030 1035 1040

Val Arg Thr Asp Glu Val Leu Ala Glu His Ile Leu Tyr Cys Ser Arg
1045 1050 1055

Ala Ser Thr Ser Met Phe Val Gly Leu Pro Ser Val Val Arg Arg Glu
1060 1065 1070

Val Arg Ser Asp Ala Val Thr Phe Glu Ile Thr His Glu Ile Ala Ser
1075 1080 1085

Leu His Thr Ala Leu Gly Tyr Ser Ser Val Ile Ala Pro Ala His Val
1090 1095 1100

Ala Ala Ile Thr Thr Asp Met Gly Val His Cys Gln Asp Leu Phe Met
1105 1110 1115 1120

Ile Phe Pro Gly Asp Ala Tyr Gln Asp Arg Gln Leu His Asp Tyr Ile
1125 1130 1135

Lys Met Lys Ala Gly Val Gln Thr Gly Ser Pro Gly Asn Arg Met Asp
1140 1145 1150

His Val Gly Tyr Thr Ala Gly Val Pro Arg Cys Glu Asn Leu Pro Gly
1155 1160 1165

Leu Ser His Gly Gln Leu Ala Thr Cys Glu Ile Ile Pro Thr Pro Val
1170 1175 1180

Thr Ser Asp Val Ala Tyr Phe Gln Thr Pro Ser Asn Pro Arg Gly Arg
1185 1190 1195 1200

Ala Ala Ser Val Val Ser Cys Asp Ala Tyr Ser Asn Glu Ser Ala Glu
1205 1210 1215

Arg Leu Phe Tyr Asp His Ser Ile Pro Asp Pro Ala Tyr Glu Cys Arg
1220 1225 1230

Ser Thr Asn Asn Pro Trp Ala Ser Gln Arg Gly Ser Leu Gly Asp Val
1235 1240 1245

Leu Tyr Asn Ile Thr Phe Arg Gln Thr Ala Leu Pro Gly Met Tyr Ser
1250 1255 1260

Pro Cys Arg Gln Phe Phe His Lys Glu Asp Ile Met Arg Tyr Asn Arg
1265 1270 1275 1280

Gly Leu Tyr Thr Leu Val Asn Glu Tyr Ser Ala Arg Leu Ala Gly Ala
1285 1290 1295

Pro Ala Thr Ser Thr Thr Asp Leu Gln Tyr Val Val Val Asn Gly Thr
1300 1305 1310

Asp Val Phe Leu Asp Gln Pro Cys His Met Leu Gln Glu Ala Tyr Pro
1315 1320 1325

Thr Leu Ala Ala Ser His Arg Val Met Leu Ala Glu Tyr Met Ser Asn
1330 1335 1340

Lys Gln Thr His Ala Pro Val His Met Gly Gln Tyr Leu Ile Glu Glu
1345 1350 1355 1360

Val Ala Pro Met Lys Arg Leu Leu Lys Leu Gly Asn Lys Val Val Tyr
1365 1370 1375



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1143 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: N

(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..1143
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

£2r tTI ^ G ??* ^ G £5° T 1 AAC CTG CTC TAC GTA GAC GAG GCG AAT 4 8 

Ser Ile Arg Gly Gln Thr Phe Asn Leu Leu Tyr Val Asp Glu Ala Asn
1 5 10 15

TTT ATT AAA AAG GAT GCA CTG CCG GCT ATT CTG GGT TTC ATG CTT CAG 96
Phe Ile Lys Lys Asp Ala Leu Pro Ala Ile Leu Gly Phe Met Leu Gln
20 25 30

AAA GAC GCC AAG CTT ATA TTT ATA TCA TCC GTG AAC TCG TCA GAC CGC 144
Lys Asp Ala Lys Leu Ile Phe Ile Ser Ser Val Asn Ser Ser Asp Arg
35 40 45

TCC ACG AGT TTC CTG CTT AAC CTC AGG AAC GCC CAG GAA AAG ATG CTG 192
Ser Thr Ser Phe Leu Leu Asn Leu Arg Asn Ala Gln Glu Lys Met Leu
50 55 60

AAT GTG GTC AGT TAC GTG TGT GCG GAC CAC CGA GAA GAT TTC CAC CTG 240
Asn Val Val Ser Tyr Val Cys Ala Asp His Arg Glu Asp Phe His Leu
65 70 75 80

CAA GAC GCA CTA GTG TCC TGT CCT TGT TAC AGA CTG CAC ATT CCG ACG 288
Gln Asp Ala Leu Val Ser Cys Pro Cys Tyr Arg Leu His Ile Pro Thr
85 90 95

TAC ATC ACC ATC GAC GAA TCC ATC AAA ACC ACC ACC AAC CTC TTT ATG 336
Tyr Ile Thr Ile Asp Glu Ser Ile Lys Thr Thr Thr Asn Leu Phe Met
100 105 110

GAG GGG GCA TTC GAC ACC GAA CTA ATG GGC GAG GGA GCA GCG TCG TCA 384
Glu Gly Ala Phe Asp Thr Glu Leu Met Gly Glu Gly Ala Ala Ser Ser
115 120 125

AAT GCT ACG CTT TAC CGC GTG GTG GGT GAC GCA GCG CTG ACA CAG TTT 432
Asn Ala Thr Leu Tyr Arg Val Val Gly Asp Ala Ala Leu Thr Gln Phe
130 135 140

GAC ATG TGT CGG GTA GAC ACC ACC GCC CAG GAG GTT CAG AAG TGC CTT 480
Asp Met Cys Arg Val Asp Thr Thr Ala Gln Glu Val Gln Lys Cys Leu
145 150 155 160

GGA AAA CAG CTG TTT GTT TAC ATC GAC CCC GCG TAT ACG AAC AAC ACG 528
Gly Lys Gln Leu Phe Val Tyr Ile Asp Pro Ala Tyr Thr Asn Asn Thr
165 170 175

GAG GCG TCC GGT ACT GGC GTG GGC GCC GTT GTC ACG AGT ACT CAG ACT 576
Glu Ala Ser Gly Thr Gly Val Gly Ala Val Val Thr Ser Thr Gln Thr
180 185 190

CCC ACC AGA AGC CTC ATA TTG GGC ATG GAG CAT TTC TTC CTG CGC GAC 624
Pro Thr Arg Ser Leu Ile Leu Gly Met Glu His Phe Phe Leu Arg Asp
195 200 205

CTC ACT GGC GCA GCT GCT TAC GAG ATA GCG TCC TGC GCA TGC ACG ATG 672
Leu Thr Gly Ala Ala Ala Tyr Glu Ile Ala Ser Cys Ala Cys Thr Met
210 215 220

ATT AAG GCG ATC GCT GTG CTC CAC ACC ACA ATT GAG CGC GTG AAC GCG 720
Ile Lys Ala Ile Ala Val Leu His Thr Thr Ile Glu Arg Val Asn Ala
225 230 235 240

GCG GTC GAA GGC AAC AGC AGC CAA GAT TCT GGG GTG GCC ATT GCA ACC 768
Ala Val Glu Gly Asn Ser Ser Gln Asp Ser Gly Val Ala Ile Ala Thr
245 250 255

GTC CTT AAC GAA ATA TGC CCG CTC CCC ATA CAT TTT CTA CAC TAT ACT 816
Val Leu Asn Glu Ile Cys Pro Leu Pro Ile His Phe Leu His Tyr Thr
260 265 270

GAC AAG AGC AGC GCC CTG CAG TGG CCA ATT TAC ATG TTG GGA GGC GAG 864
Asp Lys Ser Ser Ala Leu Gln Trp Pro Ile Tyr Met Leu Gly Gly Glu
275 280 285

AAA TCC TCC GCG TTT GAG ACA TTC ATC TAC GCT CTG AAC TCC GGC ACC 912
Lys Ser Ser Ala Phe Glu Thr Phe Ile Tyr Ala Leu Asn Ser Gly Thr
290 295 300

CTG AGC GCC AGC CAG ACG GTG GTG TCC AAC ACC ATC AAA ATA TCA TTT 960
Leu Ser Ala Ser Gln Thr Val Val Ser Asn Thr Ile Lys Ile Ser Phe
305 310 315 320

GAC CCG GTG ACC TAC CTG GTA GAA CAG GTC CGC GCG ATC AAG TGC GTC 1008
Asp Pro Val Thr Tyr Leu Val Glu Gln Val Arg Ala Ile Lys Cys Val
325 330 335

CCG CTT AGG GAT GGA GGG CAG TCA TAC AGC GCC AAG CAA AAG CAC ATG 1056
Pro Leu Arg Asp Gly Gly Gln Ser Tyr Ser Ala Lys Gln Lys His Met
340 345 350

TCG GAC GAC TTA CTT GTG GCA GTT GTC ATG GCC CAT TTT ATG GCT ACC 1104
Ser Asp Asp Leu Leu Val Ala Val Val Met Ala His Phe Met Ala Thr
355 360 365

GAT GAT AGA CAC ATG TAC AAG CCC ATA TCC CCA CAA TAA 1143
Asp Asp Arg His Met Tyr Lys Pro Ile Ser Pro Gln
370 375 380



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 380 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Ser Ile Arg Gly Gln Thr Phe Asn Leu Leu Tyr Val Asp Glu Ala Asn
1 5 10 15

Phe Ile Lys Lys Asp Ala Leu Pro Ala Ile Leu Gly Phe Met Leu Gln
20 25 30

Lys Asp Ala Lys Leu Ile Phe Ile Ser Ser Val Asn Ser Ser Asp Arg
35 40 45

Ser Thr Ser Phe Leu Leu Asn Leu Arg Asn Ala Gln Glu Lys Met Leu
50 55 60

Asn Val Val Ser Tyr Val Cys Ala Asp His Arg Glu Asp Phe His Leu
65 70 75 80

Gln Asp Ala Leu Val Ser Cys Pro Cys Tyr Arg Leu His Ile Pro Thr
85 90 95

Tyr Ile Thr Ile Asp Glu Ser Ile Lys Thr Thr Thr Asn Leu Phe Met
100 105 110

Glu Gly Ala Phe Asp Thr Glu Leu Met Gly Glu Gly Ala Ala Ser Ser
115 120 125

Asn Ala Thr Leu Tyr Arg Val Val Gly Asp Ala Ala Leu Thr Gln Phe
130 135 140

Asp Met Cys Arg Val Asp Thr Thr Ala Gln Glu Val Gln Lys Cys Leu
145 150 155 160

Gly Lys Gln Leu Phe Val Tyr Ile Asp Pro Ala Tyr Thr Asn Asn Thr
165 170 175

Glu Ala Ser Gly Thr Gly Val Gly Ala Val Val Thr Ser Thr Gln Thr
180 185 190

Pro Thr Arg Ser Leu Ile Leu Gly Met Glu His Phe Phe Leu Arg Asp
195 200 205

Leu Thr Gly Ala Ala Ala Tyr Glu Ile Ala Ser Cys Ala Cys Thr Met
210 215 220

Ile Lys Ala Ile Ala Val Leu His Thr Thr Ile Glu Arg Val Asn Ala
225 230 235 240

Ala Val Glu Gly Asn Ser Ser Gln Asp Ser Gly Val Ala Ile Ala Thr
245 250 255

Val Leu Asn Glu Ile Cys Pro Leu Pro Ile His Phe Leu His Tyr Thr
260 265 270

Asp Lys Ser Ser Ala Leu Gln Trp Pro Ile Tyr Met Leu Gly Gly Glu
275 280 285

Lys Ser Ser Ala Phe Glu Thr Phe Ile Tyr Ala Leu Asn Ser Gly Thr
290 295 300

Leu Ser Ala Ser Gln Thr Val Val Ser Asn Thr Ile Lys Ile Ser Phe
305 310 315 320

Asp Pro Val Thr Tyr Leu Val Glu Gln Val Arg Ala Ile Lys Cys Val
325 330 335

Pro Leu Arg Asp Gly Gly Gln Ser Tyr Ser Ala Lys Gln Lys His Met
340 345 350

Ser Asp Asp Leu Leu Val Ala Val Val Met Ala His Phe Met Ala Thr
355 360 365

Asp Asp Arg His Met Tyr Lys Pro Ile Ser Pro Gln
370 375 380



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 234 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..234
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

ATG GGT GAG CCA GTG GAT CCT GGA CAT GTG GTG AAT GAG AAA GAT TTT 48
Met Gly Glu Pro Val Asp Pro Gly His Val Val Asn Glu Lys Asp Phe
1 5 10 15

GAG GAG TGT GAA CAA TTT TTC AGT CAA CCC CTT AGG GAG CAA GTG GTC 96
Glu Glu Cys Glu Gln Phe Phe Ser Gln Pro Leu Arg Glu Gln Val Val
20 25 30

GCG GGG GTC AGG GCA CTC GAC GGC CTC GGT CTC GCT GAC TCT CTA TGT 144
Ala Gly Val Arg Ala Leu Asp Gly Leu Gly Leu Ala Asp Ser Leu Cys
35 40 45

CAC AAA ACA GAA AGA CTC TGC CTG CTG ATG GAC CTG GTG GGC ACG GAG 192
His Lys Thr Glu Arg Leu Cys Leu Leu Met Asp Leu Val Gly Thr Glu
50 55 60

TGC TTT GCG AGG GTG TGC CGC CTA GAC ACC GGT GCG AAA TGA 234
Cys Phe Ala Arg Val Cys Arg Leu Asp Thr Gly Ala Lys
65 70 75

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 77 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Gly Glu Pro Val Asp Pro Gly His Val Val Asn Glu Lys Asp Phe
1 5 10 15

Glu Glu Cys Glu Gln Phe Phe Ser Gln Pro Leu Arg Glu Gln Val Val
20 25 30

Ala Gly Val Arg Ala Leu Asp Gly Leu Gly Leu Ala Asp Ser Leu Cys
35 40 45

His Lys Thr Glu Arg Leu Cys Leu Leu Met Asp Leu Val Gly Thr Glu
50 55 60

Cys Phe Ala Arg Val Cys Arg Leu Asp Thr Gly Ala Lys
65 70 75

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 585 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..585
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

ATG AAG ACT GTG GCG ACT CCC TTA TGT CAG TTC CAC GGC GTG TTT TGC 48
Met Lys Ser Val Ala Ser Pro Leu Cys Gln Phe His Gly Val Phe Cys
1 5 10 15

CTG TAC CAG TGT CGC CAG TGC CTG GCA TAC CAC GTG TGT GAT GGG GGC 96
Leu Tyr Gln Cys Arg Gln Cys Leu Ala Tyr His Val Cys Asp Gly Gly
20 25 30

GCC GAA TGC GTT CTC CTG CAT ACG CCG GAG AGC GTC ATC TGC GAA CTA 144
Ala Glu Cys Val Leu Leu His Thr Pro Glu Ser Val Ile Cys Glu Leu
35 40 45

ACG GGT AAC TGC ATG CTC GGC AAC ATT CAA GAG GGC CAG TTT TTA GGG 192
Thr Gly Asn Cys Met Leu Gly Asn Ile Gln Glu Gly Gln Phe Leu Gly
50 55 60

CCG GTA CCG TAT CGG ACT TTG GAT AAC CAG GTT GAC AGG GAC GCA TAT 240
Pro Val Pro Tyr Arg Thr Leu Asp Asn Gln Val Asp Arg Asp Ala Tyr
65 70 75 80

CAC GGG ATG CTA GCG TGT CTG AAA CGG GAC ATT GTG CGG TAT TTG CAG 288
His Gly Met Leu Ala Cys Leu Lys Arg Asp Ile Val Arg Tyr Leu Gln
85 90 95

ACA TGG CCG GAC ACC ACC GTA ATC GTG CAG GAA ATA GCC CTG GGG GAC 336
Thr Trp Pro Asp Thr Thr Val Ile Val Gln Glu Ile Ala Leu Gly Asp
100 105 110

GGC GTC ACC GAC ACC ATC TCG GCC ATT ATA GAT GAA ACA TTC GGT GAG 384
Gly Val Thr Asp Thr Ile Ser Ala Ile Ile Asp Glu Thr Phe Gly Glu
115 120 125

TGT CTT CCC GTA CTG GGG GAG GCC CAA GGC GGG TAC GCC CTG GTC TGT 432
Cys Leu Pro Val Leu Gly Glu Ala Gln Gly Gly Tyr Ala Leu Val Cys
130 135 140

AGC ATG TAT CTG CAC GTT ATC GTC TCC ATC TAT TCG ACA AAA ACG GTG 480
Ser Met Tyr Leu His Val Ile Val Ser Ile Tyr Ser Thr Lys Thr Val
145 150 155 160

TAC AAC AGT ATG CTA TTT AAA TGC ACA AAG AAT AAA AAG TAC GAC TGC 528
Tyr Asn Ser Met Leu Phe Lys Cys Thr Lys Asn Lys Lys Tyr Asp Cys
165 170 175

ATT GCC AAG CGG GTG CGG ACA AAA TGG ATG CGC ATG CTA TCA ACG AAA 576
Ile Ala Lys Arg Val Arg Thr Lys Trp Met Arg Met Leu Ser Thr Lys
180 185 190

GAT ACG TAG 585
Asp Thr

195 

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 194 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Lys Ser Val Ala Ser Pro Leu Cys Gln Phe His Gly Val Phe Cys
1 5 10 15

Leu Tyr Gln Cys Arg Gln Cys Leu Ala Tyr His Val Cys Asp Gly Gly
20 25 30

Ala Glu Cys Val Leu Leu His Thr Pro Glu Ser Val Ile Cys Glu Leu
35 40 45

Thr Gly Asn Cys Met Leu Gly Asn Ile Gln Glu Gly Gln Phe Leu Gly
50 55 60

Pro Val Pro Tyr Arg Thr Leu Asp Asn Gln Val Asp Arg Asp Ala Tyr
65 70 75 80

His Gly Met Leu Ala Cys Leu Lys Arg Asp Ile Val Arg Tyr Leu Gln
85 90 95

Thr Trp Pro Asp Thr Thr Val Ile Val Gln Glu Ile Ala Leu Gly Asp
100 105 110

Gly Val Thr Asp Thr Ile Ser Ala Ile Ile Asp Glu Thr Phe Gly Glu
115 120 125

Cys Leu Pro Val Leu Gly Glu Ala Gln Gly Gly Tyr Ala Leu Val Cys
130 135 140

Ser Met Tyr Leu His Val Ile Val Ser Ile Tyr Ser Thr Lys Thr Val
145 150 155 160

Tyr Asn Ser Met Leu Phe Lys Cys Thr Lys Asn Lys Lys Tyr Asp Cys
165 170 175

Ile Ala Lys Arg Val Arg Thr Lys Trp Met Arg Met Leu Ser Thr Lys
180 185 190

Asp Thr



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 939 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: N

(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..939
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

ATG GCT AGC CGG AGG CGC AAA CTT CGG AAT TTC CTA AAC AAG GAA TGC 48
Met Ala Ser Arg Arg Arg Lys Leu Arg Asn Phe Leu Asn Lys Glu Cys
1 5 10 15

ATA TGG ACT GTT AAC CCA ATG TCA GGG GAC CAT ATC AAG GTC TTT AAC 96
Ile Trp Thr Val Asn Pro Met Ser Gly Asp His Ile Lys Val Phe Asn
20 25 30

GCC TGC ACC TCT ATC TCG CCG GTG TAT GAC CCT GAG CTG GTA ACC AGC 144
Ala Cys Thr Ser Ile Ser Pro Val Tyr Asp Pro Glu Leu Val Thr Ser
35 40 45

TAC GCA CTG AGC GTG CCT GCT TAC AAT GTG TCT GTG GCT ATC TTG CTG 192
Tyr Ala Leu Ser Val Pro Ala Tyr Asn Val Ser Val Ala Ile Leu Leu
50 55 60

CAT AAA GTC ATG GGA CCG TGT GTG GCT GTG GGA ATT AAC GGA GAA ATG 240
His Lys Val Met Gly Pro Cys Val Ala Val Gly Ile Asn Gly Glu Met
65 70 75 80

ATC ATG TAC GTC GTA AGC CAG TGT GTT TCT GTG CGG CCC GTC CCG GGG 288
Ile Met Tyr Val Val Ser Gln Cys Val Ser Val Arg Pro Val Pro Gly
85 90 95

CGC GAT GGT ATG GCG CTC ATC TAC TTT GGA CAG TTT CTG GAG GAA GCA 336
Arg Asp Gly Met Ala Leu Ile Tyr Phe Gly Gln Phe Leu Glu Glu Ala
100 105 110

TCC GGA CTG AGA TTT CCC TAC ATT GCT CCG CCG CCG TCG CGC GAA CAC 384
Ser Gly Leu Arg Phe Pro Tyr Ile Ala Pro Pro Pro Ser Arg Glu His
115 120 125

GTA CCT GAC CTG ACC AGA CAA GAA TTA GTT CAT ACC TCC CAG GTG GTG 432
Val Pro Asp Leu Thr Arg Gln Glu Leu Val His Thr Ser Gln Val Val
130 135 140

CGC CGC GGC GAC CTG ACC AAT TGC ACT ATG GGT CTC GAA TTC AGG AAT 480
Arg Arg Gly Asp Leu Thr Asn Cys Thr Met Gly Leu Glu Phe Arg Asn
145 150 155 160

GTG AAC CCT TTT GTT TGG CTC GGG GGC GGA TCG GTG TGG CTG CTG TTC 528
Val Asn Pro Phe Val Trp Leu Gly Gly Gly Ser Val Trp Leu Leu Phe
165 170 175

TTG GGC GTG GAC TAC ATG GCG TTC TGT CCG GGT GTC GAC GGA ATG CCG 576
Leu Gly Val Asp Tyr Met Ala Phe Cys Pro Gly Val Asp Gly Met Pro
180 185 190

TCG TTG GCA AGA GTG GCC GCC CTG CTT ACC AGG TGC GAC CAC CCA GAC 624
Ser Leu Ala Arg Val Ala Ala Leu Leu Thr Arg Cys Asp His Pro Asp
195 200 205

TGT GTC CAC TGC CAT GGA CTC CGT GGA CAC GTT AAT GTA TTT CGT GGG 672
Cys Val His Cys His Gly Leu Arg Gly His Val Asn Val Phe Arg Gly
210 215 220

TAC TGT TCT GCG CAG TCG CCG GGT CTA TCT AAC ATC TGT CCC TGT ATC 720
Tyr Cys Ser Ala Gln Ser Pro Gly Leu Ser Asn Ile Cys Pro Cys Ile
225 230 235 240

AAA TCA TGT GGG ACC GGG AAT GGA GTG ACT AGG GTC ACT GGA AAC AGA 768
Lys Ser Cys Gly Thr Gly Asn Gly Val Thr Arg Val Thr Gly Asn Arg
245 250 255

AAT TTT CTG GGT CTT CTG TTC GAT CCC ATT GTC CAG AGC AGG GTA ACA 816
Asn Phe Leu Gly Leu Leu Phe Asp Pro Ile Val Gln Ser Arg Val Thr
260 265 270

GCT CTG AAG ATA ACT AGC CAC CCA ACC CCC ACG CAC GTC GAG AAT GTG 864
Ala Leu Lys Ile Thr Ser His Pro Thr Pro Thr His Val Glu Asn Val
275 280 285

CTA ACA GGA GTG CTC GAC GAC GGC ACC TTG GTG CCG TCC GTC CAA GGC 912
Leu Thr Gly Val Leu Asp Asp Gly Thr Leu Val Pro Ser Val Gln Gly
290 295 300

ACC CTG GGT CCT CTT ACG AAT GTC TGA 939
Thr Leu Gly Pro Leu Thr Asn Val
305 310

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 312 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Ala Ser Arg Arg Arg Lys Leu Arg Asn Phe Leu Asn Lys Glu Cys
1 5 10 15

Ile Trp Thr Val Asn Pro Met Ser Gly Asp His Ile Lys Val Phe Asn
20 25 30

Ala Cys Thr Ser Ile Ser Pro Val Tyr Asp Pro Glu Leu Val Thr Ser
35 40 45

Tyr Ala Leu Ser Val Pro Ala Tyr Asn Val Ser Val Ala Ile Leu Leu
50 55 60

His Lys Val Met Gly Pro Cys Val Ala Val Gly Ile Asn Gly Glu Met
65 70 75 80

Ile Met Tyr Val Val Ser Gln Cys Val Ser Val Arg Pro Val Pro Gly
85 90 95

Arg Asp Gly Met Ala Leu Ile Tyr Phe Gly Gln Phe Leu Glu Glu Ala
100 105 110

Ser Gly Leu Arg Phe Pro Tyr Ile Ala Pro Pro Pro Ser Arg Glu His
115 120 125

Val Pro Asp Leu Thr Arg Gln Glu Leu Val His Thr Ser Gln Val Val
130 135 140

Arg Arg Gly Asp Leu Thr Asn Cys Thr Met Gly Leu Glu Phe Arg Asn
145 150 155 160

Val Asn Pro Phe Val Trp Leu Gly Gly Gly Ser Val Trp Leu Leu Phe
165 170 175

Leu Gly Val Asp Tyr Met Ala Phe Cys Pro Gly Val Asp Gly Met Pro
180 185 190

Ser Leu Ala Arg Val Ala Ala Leu Leu Thr Arg Cys Asp His Pro Asp
195 200 205

Cys Val His Cys His Gly Leu Arg Gly His Val Asn Val Phe Arg Gly
210 215 220

Tyr Cys Ser Ala Gln Ser Pro Gly Leu Ser Asn Ile Cys Pro Cys Ile
225 230 235 240

Lys Ser Cys Gly Thr Gly Asn Gly Val Thr Arg Val Thr Gly Asn Arg
245 250 255

Asn Phe Leu Gly Leu Leu Phe Asp Pro Ile Val Gln Ser Arg Val Thr
260 265 270

Ala Leu Lys Ile Thr Ser His Pro Thr Pro Thr His Val Glu Asn Val
275 280 285

Leu Thr Gly Val Leu Asp Asp Gly Thr Leu Val Pro Ser Val Gln Gly
290 295 300

Thr Leu Gly Pro Leu Thr Asn Val
305 310



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 86 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..86
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

ATG GAC TCA ACC AAC TCT AAA AGA GAG TTT ATT AAG TCG GCT CTG GAG 48
Met Asp Ser Thr Asn Ser Lys Arg Glu Phe Ile Lys Ser Ala Leu Glu
1 5 10 15

GCC AAC ATC AAC AGG AGG GCA GCT GTA TCG CTA TTT GA 86
Ala Asn Ile Asn Arg Arg Ala Ala Val Ser Leu Phe
20 25



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Asp Ser Thr Asn Ser Lys Arg Glu Phe Ile Lys Ser Ala Leu Glu
1 5 10 15

Ala Asn Ile Asn Arg Arg Ala Ala Val Ser Leu Phe
20 25



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1743 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: N

(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..1743
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

5™ ^ S? C G ? T 777 GGA GCG GAC TCG GTG GGG CGC GGC GGA GAA 48
Met Ala Glu Gly Gly Phe Gly Ala Asp Ser Val Gly Arg Gly Gly Glu
1 5 10 15

AAG GCC TCT GTG ACT AGG GGA GGC AGG TGG GAC TTG GGG AGC TCG GAC 96
Lys Ala Ser Val Thr Arg Gly Gly Arg Trp Asp Leu Gly Ser Ser Asp
20 25 30

GAC GAA TCA AGC ACC TCC ACA ACC AGC ACG GAT ATG GAC GAC CTC CCT 144
Asp Glu Ser Ser Thr Ser Thr Thr Ser Thr Asp Met Asp Asp Leu Pro
35 40 45

GAG GAG AGG AAA CCA CTA ACG GGA AAG TCT GTA AAA ACC TCG TAC ATA 192
Glu Glu Arg Lys Pro Leu Thr Gly Lys Ser Val Lys Thr Ser Tyr Ile
50 55 60

TAC GAC GTG CCC ACC GTC CCG ACC AGC AAG CCG TGG CAT TTA ATG CAC 240
Tyr Asp Val Pro Thr Val Pro Thr Ser Lys Pro Trp His Leu Met His
65 70 75 80

GAC AAC TCC CTC TAC GCA ACG CCT AGG TTT CCG CCC AGA CCT CTC ATA 288
Asp Asn Ser Leu Tyr Ala Thr Pro Arg Phe Pro Pro Arg Pro Leu Ile
85 90 95

CGG CAC CCT TCC GAA AAA GGC AGC ATT TTT GCC AGT CGG TTG TCA GCG 336
Arg His Pro Ser Glu Lys Gly Ser Ile Phe Ala Ser Arg Leu Ser Ala
100 105 110

£5 ? GAC GAC TCG GGA GAC TAC GCG CCA ATG GAT CGC TTC GCC TTC 384
Thr Asp Asp Asp Ser Gly Asp Tyr Ala Pro Met Asp Arg Phe Ala Phe
115 120 125

CAG AGC CCC AGG GTG TGT GGT CGC CCT CCC C~T CCC rrr — rr<r — - ,
Gln Ser Pro Arg Val Cys Gly Arg Pro Pro Leu Pro Pro Pro Asn His
130 135 140

o CT GCA A 5 T AGG CCG GCA GAC GCG TCA ATG <^G GAC GTG GGC 480
Pro Pro Pro Ala Thr Arg Pro Ala Asp Ala Ser Met Gly Asp Val Gly
145 150 155 160

TGG GCG GAT CTG CAG GGA CTC AAG AGG ACC CCA AAG GGA TT TTA AAA 528
Trp Ala Asp Leu Gln Gly Leu Lys Arg Thr Pro Lys Gly Phe Leu Lys
165 170 175

l^l TH C ^ S? C AGT CTC *** GCC CGT GGA CGC G ^ GTA GGT 576
Thr Ser Thr Lys Gly Gly Ser Leu Lys Ala Arg Gly Arg Asp Val Gly
180 185 190

GAC CGT CTC AGG GAC GGC GGC TTT GCC TTT AGT CCT AGG GGC GTG AAA 624
Asp Arg Leu Arg Asp Gly Gly Phe Ala Phe Ser Pro Arg Gly Val Lys
195 200 205

G ? C tT A ^ GG CAA AAC ATT *** TCA TGG ™ GGG ATC GGA GAA TCA 672
Ser Ala Ile Gly Gln Asn Ile Lys Ser Trp Leu Gly Ile Gly Glu Ser
210 215 220

TCG GCG ACT GCT GTC CCC GTC ACC ACG CAG CTT ATG GTA CCG GTG CAC 720
Ser Ala Thr Ala Val Pro Val Thr Thr Gln Leu Met Val Pro Val His
225 230 235 240

CTC ATT AGA ACG CCT GTG ACC GTG GAC TAC AGG AAT GTT TAT TTG CTT 768
Leu Ile Arg Thr Pro Val Thr Val Asp Tyr Arg Asn Val Tyr Leu Leu
245 250 255

TAC TTA GAG GGG GTA ATG GGT GTG GGC AAA TCA ACG CTG GTC AAC GCC 816
Tyr Leu Glu Gly Val Met Gly Val Gly Lys Ser Thr Leu Val Asn Ala
260 265 270

GTG TGC GGG ATC TTG CCC CAG GAG AGA GTG ACA AGT TTT CCC GAG CCC 864
Val Cys Gly Ile Leu Pro Gln Glu Arg Val Thr Ser Phe Pro Glu Pro
275 280 285

ATG GTG TAC TGG ACG AGG GCA TTT ACA GAT TGT TAC AAG GAA ATT TCC 912
Met Val Tyr Trp Thr Arg Ala Phe Thr Asp Cys Tyr Lys Glu Ile Ser
290 295 300

CAC CTG ATG AAG TCT GGT AAG GCG GGA GAC CCG CTG ACG TCT GCC AAA 960
His Leu Met Lys Ser Gly Lys Ala Gly Asp Pro Leu Thr Ser Ala Lys
305 310 315 320

ATA TAC TCA TGC CAA AAC AAG TTT TCG CTC CCC TTC CGG ACG AAC GCC 1008
Ile Tyr Ser Cys Gln Asn Lys Phe Ser Leu Pro Phe Arg Thr Asn Ala
325 330 335

ACC GCT ATC CTG CGA ATG ATG CAG CCC TGG AAC GTT GGG GGT GGG TCT 1056
Thr Ala Ile Leu Arg Met Met Gln Pro Trp Asn Val Gly Gly Gly Ser
340 345 350

GGG AGG GGC ACT CAC TGG TGC GTC TTT GAT AGG CAT CTC CTC TCC CCA 1104
Gly Arg Gly Thr His Trp Cys Val Phe Asp Arg His Leu Leu Ser Pro
355 360 365

GCA GTG GTG TTC CCT CTC ATG CAC CTG AAG CAC GGC CGC CTA TCT TTT 1152
Ala Val Val Phe Pro Leu Met His Leu Lys His Gly Arg Leu Ser Phe
370 375 380

GAT CAC TTC TTT CAA TTA CTT TCC ATC TTT AGA GCC ACA GAA GGC GAC 1200
Asp His Phe Phe Gln Leu Leu Ser Ile Phe Arg Ala Thr Glu Gly Asp
385 390 395 400

GTG GTC GCC ATT CTC ACC CTC TCC AGC GCC GAG TCG TTG CGG CGG GTC 1248
Val Val Ala Ile Leu Thr Leu Ser Ser Ala Glu Ser Leu Arg Arg Val
405 410 415

AGG GCG AGG GGA AGA AAG AAC GAC GGG ACG GTG GAG CAA AAC TAC ATC 1296
Arg Ala Arg Gly Arg Lys Asn Asp Gly Thr Val Glu Gln Asn Tyr Ile
420 425 430

AGA GAA TTG GCG TGG GCT TAT CAC GCC GTG TAC TGT TCA TGG ATC ATG 1344
Arg Glu Leu Ala Trp Ala Tyr His Ala Val Tyr Cys Ser Trp Ile Met
435 440 445

TTG CAG TAC ATC ACT GTG GAG CAG ATG GTA CAA CTA TGC GTA CAA ACC 1392
Leu Gln Tyr Ile Thr Val Glu Gln Met Val Gln Leu Cys Val Gln Thr
450 455 460

ACA AAT ATT CCG GAA ATC TGC TTC CGC AGC GTG CGC CTG GCA CAC AAG 1440
Thr Asn Ile Pro Glu Ile Cys Phe Arg Ser Val Arg Leu Ala His Lys
465 470 475 480

GAG GAA ACT TTG AAA AAC CTT CAC GAG CAG AGC ATG CTA CCT ATG ATC 1488
Glu Glu Thr Leu Lys Asn Leu His Glu Gln Ser Met Leu Pro Met Ile
485 490 495

ACC GGT GTA CTG GAT CCC GTG AGA CAT CAT CCC GTC GTG ATC GAG C m T 1536
Thr Gly Val Leu Asp Pro Val Arg His His Pro Val Val Ile Glu Leu
500 505 510

TGC TTT TGT TTC TTC ACA GAG CTG AGA AAA TTA CAA TTT ATC GTA GC 1584
Cys Phe Cys Phe Phe Thr Glu Leu Arg Lys Leu Gln Phe Ile Val Ala
515 520 525

GAC GCG GAT AAG TTC CAC GAC GAC GTA TGC GGC CTG TGG ACC GAA AT" 1632
Asp Ala Asp Lys Phe His Asp Asp Val Cys Gly Leu Trp Thr Glu Ile
530 535 540

TAC AGG CAG ATC CTG TCC AAT CCG GCT ATT AAA CCC AGG GCC AT<~ AAC 1680
Tyr Arg Gln Ile Leu Ser Asn Pro Ala Ile Lys Pro Arg Ala Ile Asn
545 550 555 560

TGG CCA GCA TTA GAG AGC CAG TCT AAA GCA GTT AAT CAC CTA GAG GAG 1728
Trp Pro Ala Leu Glu Ser Gln Ser Lys Ala Val Asn His Leu Glu Glu
565 570 575

ACA TGC AGG GTC TAG 1743
Thr Cys Arg Val
580

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 580 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Met Ala Glu Gly Gly Phe Gly Ala Asp Ser Val Gly Arg Gly Gly Glu
1 5 10 15

Lys Ala Ser Val Thr Arg Gly Gly Arg Trp Asp Leu Gly Ser Ser Asp
20 25 30

Asp Glu Ser Ser Thr Ser Thr Thr Ser Thr Asp Met Asp Asp Leu Pro
35 40 45

Glu Glu Arg Lys Pro Leu Thr Gly Lys Ser Val Lys Thr Ser Tyr Ile
50 55 60

Tyr Asp Val Pro Thr Val Pro Thr Ser Lys Pro Trp His Leu Met His
65 70 75 80

Asp Asn Ser Leu Tyr Ala Thr Pro Arg Phe Pro Pro Arg Pro Leu Ile
85 90 95

Arg His Pro Ser Glu Lys Gly Ser Ile Phe Ala Ser Arg Leu Ser Ala
100 105 110

Thr Asp Asp Asp Ser Gly Asp Tyr Ala Pro Met Asp Arg Phe Ala Phe
115 120 125

Gln Ser Pro Arg Val Cys Gly Arg Pro Pro Leu Pro Pro Pro Asn His
130 135 140

Pro Pro Pro Ala Thr Arg Pro Ala Asp Ala Ser Met Gly Asp Val Gly
145 150 155 160

Trp Ala Asp Leu Gln Gly Leu Lys Arg Thr Pro Lys Gly Phe Leu Lys
165 170 175

Thr Ser Thr Lys Gly Gly Ser Leu Lys Ala Arg Gly Arg Asp Val Gly
180 185 190

Asp Arg Leu Arg Asp Gly Gly Phe Ala Phe Ser Pro Arg Gly Val Lys
195 200 205

Ser Ala Ile Gly Gln Asn Ile Lys Ser Trp Leu Gly Ile Gly Glu Ser
210 215 220

Ser Ala Thr Ala Val Pro Val Thr Thr Gln Leu Met Val Pro Val His
225 230 235 240

Leu Ile Arg Thr Pro Val Thr Val Asp Tyr Arg Asn Val Tyr Leu Leu
245 250 255

Tyr Leu Glu Gly Val Met Gly Val Gly Lys Ser Thr Leu Val Asn Ala
260 265 270

Val Cys Gly Ile Leu Pro Gln Glu Arg Val Thr Ser Phe Pro Glu Pro
275 280 285

Met Val Tyr Trp Thr Arg Ala Phe Thr Asp Cys Tyr Lys Glu Ile Ser
290 295 300

His Leu Met Lys Ser Gly Lys Ala Gly Asp Pro Leu Thr Ser Ala Lys
305 310 315 320

Ile Tyr Ser Cys Gln Asn Lys Phe Ser Leu Pro Phe Arg Thr Asn Ala
325 330 335

Thr Ala Ile Leu Arg Met Met Gln Pro Trp Asn Val Gly Gly Gly Ser
340 345 350

Gly Arg Gly Thr His Trp Cys Val Phe Asp Arg His Leu Leu Ser Pro
355 360 365

Ala Val Val Phe Pro Leu Met His Leu Lys His Gly Arg Leu Ser Phe
370 375 380

Asp His Phe Phe Gln Leu Leu Ser Ile Phe Arg Ala Thr Glu Gly Asp
385 390 395 400

Val Val Ala Ile Leu Thr Leu Ser Ser Ala Glu Ser Leu Arg Arg Val
405 410 415

Arg Ala Arg Gly Arg Lys Asn Asp Gly Thr Val Glu Gln Asn Tyr Ile
420 425 430

Arg Glu Leu Ala Trp Ala Tyr His Ala Val Tyr Cys Ser Trp Ile Met
435 440 445

Leu Gln Tyr Ile Thr Val Glu Gln Met Val Gln Leu Cys Val Gln Thr
450 455 460

Thr Asn Ile Pro Glu Ile Cys Phe Arg Ser Val Arg Leu Ala His Lys
465 470 475 480

Glu Glu Thr Leu Lys Asn Leu His Glu Gln Ser Met Leu Pro Met Ile
485 490 495

Thr Gly Val Leu Asp Pro Val Arg His His Pro Val Val Ile Glu Leu
500 505 510

Cys Phe Cys Phe Phe Thr Glu Leu Arg Lys Leu Gln Phe Ile Val Ala
515 520 525

Asp Ala Asp Lys Phe His Asp Asp Val Cys Gly Leu Trp Thr Glu Ile
530 535 540

Tyr Arg Gln Ile Leu Ser Asn Pro Ala Ile Lys Pro Arg Ala Ile Asn
545 550 555 560

Trp Pro Ala Leu Glu Ser Gln Ser Lys Ala Val Asn His Leu Glu Glu
565 570 575

Thr Cys Arg Val
580

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2193 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: N

(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..2193
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

ATG CAG GGT CTA GCC TTC TTG GCG GCC CT[illegible] GCA TGC [illegible] 48
Met Gln Gly Leu Ala Phe Leu Ala Ala Leu Ala Cys Trp Arg Cys Ile
1 5 10 15

TCG TTG ACA TGT GGA GCC ACT GGC GCG TTG CCG ACA ACG GCG [illegible] 96
Ser Leu Thr Cys Gly Ala Thr Gly Ala Leu Pro Thr Thr Ala Thr Thr
20 25 30

ATA ACC CGC TCC GCC ACG CAG CTC ATC AAT GGG AGA [illegible] 144
Ile Thr Arg Ser Ala Thr Gln Leu Ile Asn Gly Arg Thr Asn Leu Ser
35 40 45

ATA GAA CTG GAA TTC AAC GGC ACT AGT TTT TTT CTA AAT TGG [illegible] 192
Ile Glu Leu Glu Phe Asn Gly Thr Ser Phe Phe Leu Asn Trp Gln Asn
50 55 60

CTG TTG AAT GTG ATC ACG GAG CCG GCC CTG ACA GAG TTG TGG ACC TCC 240
Leu Leu Asn Val Ile Thr Glu Pro Ala Leu Thr Glu Leu Trp Thr Ser
65 70 75 80

[illegible] CTC AGG [illegible] ACT CTG [illegible] CAA AGT 288
Ala Glu Val Ala Glu Asp Leu Arg Val Thr Leu Lys Lys Arg Gln Ser
85 90 95

CTT TTT TTC CCC AAC AAG ACA GTT GTG ATC TCT GGA GAC GGC CAT CGC 336
Leu Phe Phe Pro Asn Lys Thr Val Val Ile Ser Gly Asp Gly His Arg
100 105 110

TAT ACG TGC GAG GTG CCG ACG TCG TCG CAA ACT TAT AAC [illegible] 384
Tyr Thr Cys Glu Val Pro Thr Ser Ser Gln Thr Tyr Asn Ile Thr Lys
115 120 125

GGC TTT AAC TAT AGC GCT CTG CCC GGG CAC CTT GGC GGA TTT GGG ATC 432
Gly Phe Asn Tyr Ser Ala Leu Pro Gly His Leu Gly Gly Phe Gly Ile
130 135 140

AAC GCG CGT CTG GTA CTG GGT GAT ATC TTC GCA TCA AAA TGG TCG CTA 480
Asn Ala Arg Leu Val Leu Gly Asp Ile Phe Ala Ser Lys Trp Ser Leu
145 150 155 160

TTC GCG AGG GAC ACC CCA GAG TAT CGG GTG TTT TAC CCA ATG AAT GTC 528
Phe Ala Arg Asp Thr Pro Glu Tyr Arg Val Phe Tyr Pro Met Asn Val
165 170 175

ATG GCC GTC AAG TTT TCC ATA TCC ATT GGC AAC AAC GAG TCC GGC GTA 576
Met Ala Val Lys Phe Ser Ile Ser Ile Gly Asn Asn Glu Ser Gly Val
180 185 190

GCG CTC TAT GGA GTG GTG TCG GAA GAT TTC GTG GTC GTC ACG CTC CAC 624
Ala Leu Tyr Gly Val Val Ser Glu Asp Phe Val Val Val Thr Leu His
195 200 205

AAC AGG TCC AAA GAG GCT AAC GAG ACG GCG TCC CAT CTT CTG TTC GGT 672
Asn Arg Ser Lys Glu Ala Asn Glu Thr Ala Ser His Leu Leu Phe Gly
210 215 220

CTC CCG GAT TCA CTG CCA TCT CTG AAG GGC CAT GCC ACC TAT GAT GAA 720
Leu Pro Asp Ser Leu Pro Ser Leu Lys Gly His Ala Thr Tyr Asp Glu
225 230 235 240

CTC ACG TTC GCC CGA AAC GCA AAA TAT GCG CTA GTG GCG ATC CTG CCT 768
Leu Thr Phe Ala Arg Asn Ala Lys Tyr Ala Leu Val Ala Ile Leu Pro
245 250 255

AAA GAT TCT TAC CAG ACA CTC CTT ACA GAG AAT TAC ACT CGC ATA TTT 816
Lys Asp Ser Tyr Gln Thr Leu Leu Thr Glu Asn Tyr Thr Arg Ile Phe
260 265 270

CTG AAC ATG ACG GAG TCG ACG CCC CTC GAG TTC ACG CGG ACG ATC CAG 864
Leu Asn Met Thr Glu Ser Thr Pro Leu Glu Phe Thr Arg Thr Ile Gln
275 280 285

ACC AGG ATC GTA TCA ATC GAG GCC AGG CGC GCC TGC GCA GCT CAA GAG 912
Thr Arg Ile Val Ser Ile Glu Ala Arg Arg Ala Cys Ala Ala Gln Glu
290 295 300

GCG GCG CCG GAC ATA TTC TTG GTG TTG TTT CAG ATG TTG GTG GCA CAC 960
Ala Ala Pro Asp Ile Phe Leu Val Leu Phe Gln Met Leu Val Ala His
305 310 315 320

TTT CTT GTT GCG CGG GGC ATT GCC GAG CAC CGA TTT GTG GAG GTG GAC 1008
Phe Leu Val Ala Arg Gly Ile Ala Glu His Arg Phe Val Glu Val Asp
325 330 335

TGC GTG TGT CGG CAG TAT GCG GAA CTG TAT TTT CTC CGC CGC ATC TCG 1056
Cys Val Cys Arg Gln Tyr Ala Glu Leu Tyr Phe Leu Arg Arg Ile Ser
340 345 350

CGT CTG TGC ATG CCC ACG TTC ACC ACT GTC GGG TAT AAC CAC ACC ACC 1104
Arg Leu Cys Met Pro Thr Phe Thr Thr Val Gly Tyr Asn His Thr Thr
355 360 365

CTT GGC GCT GTG GCC GCC ACA CAA ATA GCT CGC GTG TCC GCC ACG AAG 1152
Leu Gly Ala Val Ala Ala Thr Gln Ile Ala Arg Val Ser Ala Thr Lys
370 375 380

TTG GCC AGT TTG CCC CGC TCT TCC CAG GAA ACA GTG CTG GCC ATG GTC 1200
Leu Ala Ser Leu Pro Arg Ser Ser Gln Glu Thr Val Leu Ala Met Val
385 390 395 400






CAG CTT GGC GCC CGT GAT GGC GCC GTC CCT TCC TCC ATT CTG GAG GGC 1248
Gln Leu Gly Ala Arg Asp Gly Ala Val Pro Ser Ser Ile Leu Glu Gly
405 410 415

ATT GCT ATG GTC GTC GAA CAT ATG TAT ACC GCC TAC ACT TAT GTG TAC 1296
Ile Ala Met Val Val Glu His Met Tyr Thr Ala Tyr Thr Tyr Val Tyr
420 425 430

ACA CTC GGC GAT ACT GAA AGA AAA TTA ATG TTG GAC ATA CAC ACG GTC 1344
Thr Leu Gly Asp Thr Glu Arg Lys Leu Met Leu Asp Ile His Thr Val
435 440 445

CTC ACC GAC AGC TGC CCG CCC AAA GAC TCC GGA GTA TCA GAA AAG CTA 1392
Leu Thr Asp Ser Cys Pro Pro Lys Asp Ser Gly Val Ser Glu Lys Leu
450 455 460

CTG AGA ACA TAT TTG ATG TTC ACA TCA ATG TGT ACC AAC ATA GAG CTG 1440
Leu Arg Thr Tyr Leu Met Phe Thr Ser Met Cys Thr Asn Ile Glu Leu
465 470 475 480

GGC GAA ATG ATC GCC CGC TTT TCC AAA CCG GAC AGC CTT AAC ATC TAT 1488
Gly Glu Met Ile Ala Arg Phe Ser Lys Pro Asp Ser Leu Asn Ile Tyr
485 490 495

AGG GCA TTC TCC CCC TGC TTT CTA GGA CTA AGG TAC GAT TTG CAT CCA 1536
Arg Ala Phe Ser Pro Cys Phe Leu Gly Leu Arg Tyr Asp Leu His Pro
500 505 510

GCC AAG TTG CGC GCC GAG GCG CCG CAG TCG TCC GCT CTG ACG CGG ACT 1584
Ala Lys Leu Arg Ala Glu Ala Pro Gln Ser Ser Ala Leu Thr Arg Thr
515 520 525

GCC GTT GCC AGA GGA ACA TCG GGA TTC GCA GAA TTG CTC CAC GCG CTG 1632
Ala Val Ala Arg Gly Thr Ser Gly Phe Ala Glu Leu Leu His Ala Leu
530 535 540

CAC CTC GAT AGC TTA AAT TTA ATT CCG GCG ATT AAC TGT TCA AAG ATT 1680
His Leu Asp Ser Leu Asn Leu Ile Pro Ala Ile Asn Cys Ser Lys Ile
545 550 555 560

ACA GCC GAC AAG ATA ATA GCT ACG GTA CCC TTG CCT CAC GTC ACG TAT 1728
Thr Ala Asp Lys Ile Ile Ala Thr Val Pro Leu Pro His Val Thr Tyr
565 570 575

ATC ATC AGT TCC GAA GCA CTC TCG AAC GCT GTT GTC TAC GAG GTG TCG 1776
Ile Ile Ser Ser Glu Ala Leu Ser Asn Ala Val Val Tyr Glu Val Ser
580 585 590

GAG ATC TTC CTC AAG AGT GCC ATG TTT ATA TCT GCT ATC AAA CCC GAT 1824
Glu Ile Phe Leu Lys Ser Ala Met Phe Ile Ser Ala Ile Lys Pro Asp
595 600 605

TGC TCC GGC TTT AAC TTT TCT CAG ATT GAT AGG CAC ATT CCC ATA GTC 1872
Cys Ser Gly Phe Asn Phe Ser Gln Ile Asp Arg His Ile Pro Ile Val
610 615 620

TAC AAC ATC AGC ACA CCA AGA AGA GGT TGC CCC CTT TGT GAC TCT GTA 1920
Tyr Asn Ile Ser Thr Pro Arg Arg Gly Cys Pro Leu Cys Asp Ser Val
625 630 635 640

ATC ATG AGC TAC GAT GAG AGC GAT GGC CTG CAG TCT CTC ATG TAT GTC 1968
Ile Met Ser Tyr Asp Glu Ser Asp Gly Leu Gln Ser Leu Met Tyr Val
645 650 655

ACT AAT GAA AGG GTG CAG ACC AAC CTC TTT TTA GAT AAG TCA CCT TTC 2016
Thr Asn Glu Arg Val Gln Thr Asn Leu Phe Leu Asp Lys Ser Pro Phe
660 665 670

TTT GAT AAT AAC AAC CTA CAC ATT CAT TAT TTG TGG CTG AGG GAC AAC 2064
Phe Asp Asn Asn Asn Leu His Ile His Tyr Leu Trp Leu Arg Asp Asn
675 680 685

GGG ACC GTA GTG GAG ATA AGG GGC ATG TAT AGA AGA CGC GCA GCC AGT 2112
Gly Thr Val Val Glu Ile Arg Gly Met Tyr Arg Arg Arg Ala Ala Ser
690 695 700

GCT TTG TTT CTA ATT CTC TCT TTT ATT GGG TTC TCG GGG GTT ATC TAC 2160
Ala Leu Phe Leu Ile Leu Ser Phe Ile Gly Phe Ser Gly Val Ile Tyr
705 710 715 720

TTT CTT TAC AGA CTG TTT TCC ATC CTT TAT TAG 2193
Phe Leu Tyr Arg Leu Phe Ser Ile Leu Tyr
725 730

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 730 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Met Gln Gly Leu Ala Phe Leu Ala Ala Leu Ala Cys Trp Arg Cys Ile
1 5 10 15

Ser Leu Thr Cys Gly Ala Thr Gly Ala Leu Pro Thr Thr Ala Thr Thr
20 25 30

Ile Thr Arg Ser Ala Thr Gln Leu Ile Asn Gly Arg Thr Asn Leu Ser
35 40 45

Ile Glu Leu Glu Phe Asn Gly Thr Ser Phe Phe Leu Asn Trp Gln Asn
50 55 60

Leu Leu Asn Val Ile Thr Glu Pro Ala Leu Thr Glu Leu Trp Thr Ser
65 70 75 80

Ala Glu Val Ala Glu Asp Leu Arg Val Thr Leu Lys Lys Arg Gln Ser
85 90 95

Leu Phe Phe Pro Asn Lys Thr Val Val Ile Ser Gly Asp Gly His Arg
100 105 110

Tyr Thr Cys Glu Val Pro Thr Ser Ser Gln Thr Tyr Asn Ile Thr Lys
115 120 125

Gly Phe Asn Tyr Ser Ala Leu Pro Gly His Leu Gly Gly Phe Gly Ile
130 135 140

Asn Ala Arg Leu Val Leu Gly Asp Ile Phe Ala Ser Lys Trp Ser Leu
145 150 155 160

Phe Ala Arg Asp Thr Pro Glu Tyr Arg Val Phe Tyr Pro Met Asn Val
165 170 175

Met Ala Val Lys Phe Ser Ile Ser Ile Gly Asn Asn Glu Ser Gly Val
180 185 190

Ala Leu Tyr Gly Val Val Ser Glu Asp Phe Val Val Val Thr Leu His
195 200 205

Asn Arg Ser Lys Glu Ala Asn Glu Thr Ala Ser His Leu Leu Phe Gly
210 215 220

Leu Pro Asp Ser Leu Pro Ser Leu Lys Gly His Ala Thr Tyr Asp Glu
225 230 235 240

Leu Thr Phe Ala Arg Asn Ala Lys Tyr Ala Leu Val Ala Ile Leu Pro
245 250 255

Lys Asp Ser Tyr Gln Thr Leu Leu Thr Glu Asn Tyr Thr Arg Ile Phe
260 265 270

Leu Asn Met Thr Glu Ser Thr Pro Leu Glu Phe Thr Arg Thr Ile Gln
275 280 285

Thr Arg Ile Val Ser Ile Glu Ala Arg Arg Ala Cys Ala Ala Gln Glu
290 295 300

Ala Ala Pro Asp Ile Phe Leu Val Leu Phe Gln Met Leu Val Ala His
305 310 315 320

Phe Leu Val Ala Arg Gly Ile Ala Glu His Arg Phe Val Glu Val Asp
325 330 335

Cys Val Cys Arg Gln Tyr Ala Glu Leu Tyr Phe Leu Arg Arg Ile Ser
340 345 350

Arg Leu Cys Met Pro Thr Phe Thr Thr Val Gly Tyr Asn His Thr Thr
355 360 365

Leu Gly Ala Val Ala Ala Thr Gln Ile Ala Arg Val Ser Ala Thr Lys
370 375 380

Leu Ala Ser Leu Pro Arg Ser Ser Gln Glu Thr Val Leu Ala Met Val
385 390 395 400

Gln Leu Gly Ala Arg Asp Gly Ala Val Pro Ser Ser Ile Leu Glu Gly
405 410 415

Ile Ala Met Val Val Glu His Met Tyr Thr Ala Tyr Thr Tyr Val Tyr
420 425 430

Thr Leu Gly Asp Thr Glu Arg Lys Leu Met Leu Asp Ile His Thr Val
435 440 445

Leu Thr Asp Ser Cys Pro Pro Lys Asp Ser Gly Val Ser Glu Lys Leu
450 455 460

Leu Arg Thr Tyr Leu Met Phe Thr Ser Met Cys Thr Asn Ile Glu Leu
465 470 475 480

Gly Glu Met Ile Ala Arg Phe Ser Lys Pro Asp Ser Leu Asn Ile Tyr
485 490 495

Arg Ala Phe Ser Pro Cys Phe Leu Gly Leu Arg Tyr Asp Leu His Pro
500 505 510

Ala Lys Leu Arg Ala Glu Ala Pro Gln Ser Ser Ala Leu Thr Arg Thr
515 520 525

Ala Val Ala Arg Gly Thr Ser Gly Phe Ala Glu Leu Leu His Ala Leu
530 535 540

His Leu Asp Ser Leu Asn Leu Ile Pro Ala Ile Asn Cys Ser Lys Ile
545 550 555 560

Thr Ala Asp Lys Ile Ile Ala Thr Val Pro Leu Pro His Val Thr Tyr
565 570 575

Ile Ile Ser Ser Glu Ala Leu Ser Asn Ala Val Val Tyr Glu Val Ser
580 585 590

Glu Ile Phe Leu Lys Ser Ala Met Phe Ile Ser Ala Ile Lys Pro Asp
595 600 605

Cys Ser Gly Phe Asn Phe Ser Gln Ile Asp Arg His Ile Pro Ile Val
610 615 620

Tyr Asn Ile Ser Thr Pro Arg Arg Gly Cys Pro Leu Cys Asp Ser Val
625 630 635 640

Ile Met Ser Tyr Asp Glu Ser Asp Gly Leu Gln Ser Leu Met Tyr Val
645 650 655

Thr Asn Glu Arg Val Gln Thr Asn Leu Phe Leu Asp Lys Ser Pro Phe
660 665 670

Phe Asp Asn Asn Asn Leu His Ile His Tyr Leu Trp Leu Arg Asp Asn
675 680 685

Gly Thr Val Val Glu Ile Arg Gly Met Tyr Arg Arg Arg Ala Ala Ser
690 695 700

Ala Leu Phe Leu Ile Leu Ser Phe Ile Gly Phe Ser Gly Val Ile Tyr
705 710 715 720

Phe Leu Tyr Arg Leu Phe Ser Ile Leu Tyr
725 730

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1215 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: N

(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1215

(D) OTHER INFORMATION: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

ATG TTA CGA GTT CCG GAC GTG AAG GCT AGT CTA GTA GAG GGC GCG GCG 48
Met Leu Arg Val Pro Asp Val Lys Ala Ser Leu Val Glu Gly Ala Ala
1 5 10 15

CGC CTG TCG ACA GGC GAG CGC GTG TTT CAC GTC TTG ACC TCT CCG GCG 96
Arg Leu Ser Thr Gly Glu Arg Val Phe His Val Leu Thr Ser Pro Ala
20 25 30

GTG GCG GCC ATG GTG GGA GTC TCT AAT CCT GAA GTC CCG ATG CCA CTG 144
Val Ala Ala Met Val Gly Val Ser Asn Pro Glu Val Pro Met Pro Leu
35 40 45

TTG TTC GAA AAG TTT GGG ACT CCG GAC TCG TCT ACC CTG CCA CTC TAC 192
Leu Phe Glu Lys Phe Gly Thr Pro Asp Ser Ser Thr Leu Pro Leu Tyr
50 55 60

GCG GCT AGG CAC CCG GAA CTA TCG TTG CTA CGG ATC ATG CTC TCA CCG 240
Ala Ala Arg His Pro Glu Leu Ser Leu Leu Arg Ile Met Leu Ser Pro
65 70 75 80

CAC CCC TAC GCG TTA AGA AGC CAC TTG TGC GTA GGC GAA GAG ACC GCA 288
His Pro Tyr Ala Leu Arg Ser His Leu Cys Val Gly Glu Glu Thr Ala
85 90 95

TCT CTT GGC GTT TAC CTG CAC TCC AAG CCA GTC GTA CGC GGC CAC GAA 336
Ser Leu Gly Val Tyr Leu His Ser Lys Pro Val Val Arg Gly His Glu
100 105 110

TTC GAG GAC ACG CAG ATA CTA CCG GAG TGC CGG CTG GCC ATA ACG AGC 384
Phe Glu Asp Thr Gln Ile Leu Pro Glu Cys Arg Leu Ala Ile Thr Ser
115 120 125

GAC CAG TCT TAT ACC AAC TTT AAG ATT ATA GAT CTG CCA GCG GGA TGC 432
Asp Gln Ser Tyr Thr Asn Phe Lys Ile Ile Asp Leu Pro Ala Gly Cys
130 135 140

CGT CGC GTC CCC ATA CAC GCC GCG AAC AAG CGT GTC GTC ATC GAC GAG 480
Arg Arg Val Pro Ile His Ala Ala Asn Lys Arg Val Val Ile Asp Glu
145 150 155 160

GCC GCC AAC CGC ATA AAG GTG TTT GAC CCA GAG TCG CCT TTA CCG CGT 528
Ala Ala Asn Arg Ile Lys Val Phe Asp Pro Glu Ser Pro Leu Pro Arg
165 170 175

CAC CCC ATA ACA CCC CGT GCC GGT CAG ACC AGA TCT ATA CTG AAA CAC 576
His Pro Ile Thr Pro Arg Ala Gly Gln Thr Arg Ser Ile Leu Lys His
180 185 190

AAC ATC GCA CAG GTT TGC GAA CGG GAT ATC GTG TCA CTT AAC ACA GAC 624
Asn Ile Ala Gln Val Cys Glu Arg Asp Ile Val Ser Leu Asn Thr Asp
195 200 205

AAC GAG GCC GCG TCT ATG TTC TAC ATG ATT GGA CTC AGG CGG CCG AGA 672
Asn Glu Ala Ala Ser Met Phe Tyr Met Ile Gly Leu Arg Arg Pro Arg
210 215 220

CTC GGA GAA AGC CCG GTC TGT GAC TTC AAC ACC GTT ACC ATC ATG GAG 720
Leu Gly Glu Ser Pro Val Cys Asp Phe Asn Thr Val Thr Ile Met Glu
225 230 235 240

CGT GCT AAC AAC TCG ATA ACT TTT CTA CCC AAG CTA AAA CTG AAC CGG 768
Arg Ala Asn Asn Ser Ile Thr Phe Leu Pro Lys Leu Lys Leu Asn Arg
245 250 255

CTA CAA CAC CTG TTC CTG AAG CAC GTG TTG CTG CGC AGC ATG GGG CTG 816
Leu Gln His Leu Phe Leu Lys His Val Leu Leu Arg Ser Met Gly Leu
260 265 270

GAA AAC ATC GTG TCG TGT TTC TCA TCG CTG TAC GGC GCA GAA CTT GCC 864
Glu Asn Ile Val Ser Cys Phe Ser Ser Leu Tyr Gly Ala Glu Leu Ala
275 280 285

CCT GCG AAA ACA CAC GAG CGG GAG TTC TTC GGC GCT CTG CTA GAA AGA 912
Pro Ala Lys Thr His Glu Arg Glu Phe Phe Gly Ala Leu Leu Glu Arg
290 295 300

CTC AAA CGT CGG GTG GAG GAC GCG GTC TTC TGC CTG AAT ACC ATA GAG 960
Leu Lys Arg Arg Val Glu Asp Ala Val Phe Cys Leu Asn Thr Ile Glu
305 310 315 320

GAT TTC CCG TTT AGG GAA CCC ATT CGC CAA CCC CCA GAT TGT TCC AAG 1008
Asp Phe Pro Phe Arg Glu Pro Ile Arg Gln Pro Pro Asp Cys Ser Lys
325 330 335

GTG CTT ATA GAA GCC ATG GAA AAG TAC TTT ATG ATG TGT AGC CCC AAA 1056
Val Leu Ile Glu Ala Met Glu Lys Tyr Phe Met Met Cys Ser Pro Lys
340 345 350

GAC CGT CAA AGC GCC GCA TGG CTA GGT GCA GGG GTG GTC GAA CTG ATA 1104
Asp Arg Gln Ser Ala Ala Trp Leu Gly Ala Gly Val Val Glu Leu Ile
355 360 365

TGT GAC GGC AAT CCA CTT TCT GAG GTG CTC GGA TTT CTT GCC AAG TAT 1152
Cys Asp Gly Asn Pro Leu Ser Glu Val Leu Gly Phe Leu Ala Lys Tyr
370 375 380

ATG CCC ATA CAA AAA GAA TGC ACA GGA AAC CTT TTA AAA ATC TAC GCT 1200
Met Pro Ile Gln Lys Glu Cys Thr Gly Asn Leu Leu Lys Ile Tyr Ala
385 390 395 400

TTA TTG ACC GTC TAA 1215
Leu Leu Thr Val



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 404 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Met Leu Arg Val Pro Asp Val Lys Ala Ser Leu Val Glu Gly Ala Ala
1 5 10 15

Arg Leu Ser Thr Gly Glu Arg Val Phe His Val Leu Thr Ser Pro Ala
20 25 30

Val Ala Ala Met Val Gly Val Ser Asn Pro Glu Val Pro Met Pro Leu
35 40 45

Leu Phe Glu Lys Phe Gly Thr Pro Asp Ser Ser Thr Leu Pro Leu Tyr
50 55 60

Ala Ala Arg His Pro Glu Leu Ser Leu Leu Arg Ile Met Leu Ser Pro
65 70 75 80

His Pro Tyr Ala Leu Arg Ser His Leu Cys Val Gly Glu Glu Thr Ala
85 90 95

Ser Leu Gly Val Tyr Leu His Ser Lys Pro Val Val Arg Gly His Glu
100 105 110

Phe Glu Asp Thr Gln Ile Leu Pro Glu Cys Arg Leu Ala Ile Thr Ser
115 120 125

Asp Gln Ser Tyr Thr Asn Phe Lys Ile Ile Asp Leu Pro Ala Gly Cys
130 135 140

Arg Arg Val Pro Ile His Ala Ala Asn Lys Arg Val Val Ile Asp Glu
145 150 155 160

Ala Ala Asn Arg Ile Lys Val Phe Asp Pro Glu Ser Pro Leu Pro Arg
165 170 175

His Pro Ile Thr Pro Arg Ala Gly Gln Thr Arg Ser Ile Leu Lys His
180 185 190

Asn Ile Ala Gln Val Cys Glu Arg Asp Ile Val Ser Leu Asn Thr Asp
195 200 205

Asn Glu Ala Ala Ser Met Phe Tyr Met Ile Gly Leu Arg Arg Pro Arg
210 215 220

Leu Gly Glu Ser Pro Val Cys Asp Phe Asn Thr Val Thr Ile Met Glu
225 230 235 240

Arg Ala Asn Asn Ser Ile Thr Phe Leu Pro Lys Leu Lys Leu Asn Arg
245 250 255

Leu Gln His Leu Phe Leu Lys His Val Leu Leu Arg Ser Met Gly Leu
260 265 270

Glu Asn Ile Val Ser Cys Phe Ser Ser Leu Tyr Gly Ala Glu Leu Ala
275 280 285

Pro Ala Lys Thr His Glu Arg Glu Phe Phe Gly Ala Leu Leu Glu Arg
290 295 300

Leu Lys Arg Arg Val Glu Asp Ala Val Phe Cys Leu Asn Thr Ile Glu
305 310 315 320

Asp Phe Pro Phe Arg Glu Pro Ile Arg Gln Pro Pro Asp Cys Ser Lys
325 330 335

Val Leu Ile Glu Ala Met Glu Lys Tyr Phe Met Met Cys Ser Pro Lys
340 345 350

Asp Arg Gln Ser Ala Ala Trp Leu Gly Ala Gly Val Val Glu Leu Ile
355 360 365

Cys Asp Gly Asn Pro Leu Ser Glu Val Leu Gly Phe Leu Ala Lys Tyr
370 375 380

Met Pro Ile Gln Lys Glu Cys Thr Gly Asn Leu Leu Lys Ile Tyr Ala
385 390 395 400

Leu Leu Thr Val

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2259 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2259
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

ATG GCA GCG CTC GAG GGC CCC CTA CTA CTG CCA CCG AGC GCC TCC CTG 48
Met Ala Ala Leu Glu Gly Pro Leu Leu Leu Pro Pro Ser Ala Ser Leu
1 5 10 15

ACG ACG AGT CCG CAG ACC ACG TGT TAT CAA GCG ACT TGG GAA TCA CAG 96
Thr Thr Ser Pro Gln Thr Thr Cys Tyr Gln Ala Thr Trp Glu Ser Gln
20 25 30

CTG GAA ATA TTC TGC TGT CTG GCC ACC AAC TCG CAC CTG CAG GCA GAG 144
Leu Glu Ile Phe Cys Cys Leu Ala Thr Asn Ser His Leu Gln Ala Glu
35 40 45

CTG ACC TTA GAA GGT CTT GAT AAG ATG ATG CAG CCC GAG CCC ACC TTT 192
Leu Thr Leu Glu Gly Leu Asp Lys Met Met Gln Pro Glu Pro Thr Phe
50 55 60

TTC GCC TGC AGA GCG ATA CGC AGA CTA CTC CTG GGG GAA CGC CTC CAC 240
Phe Ala Cys Arg Ala Ile Arg Arg Leu Leu Leu Gly Glu Arg Leu His
65 70 75 80

CCT TTT ATA CAT CAA GAA GGG ACT CTT TTG GGA AAA GTG GGT CGA CGG 288
Pro Phe Ile His Gln Glu Gly Thr Leu Leu Gly Lys Val Gly Arg Arg
85 90 95

TAC AGC GGC GAA GGT TTA ATA ATT GAC GGT GGT GGA GTG TTT ACG CGC 336
Tyr Ser Gly Glu Gly Leu Ile Ile Asp Gly Gly Gly Val Phe Thr Arg
100 105 110

GGA CAG ATA GAC ACC GAC AAC TAC CTA CCT GCG GTG GGA TCA TGG GAA 384
Gly Gln Ile Asp Thr Asp Asn Tyr Leu Pro Ala Val Gly Ser Trp Glu
115 120 125

CTT ACC GAT GAT TGT GAT AAA CCC TGC GAA TTC AGG GAG CTA CGC TCG 432
Leu Thr Asp Asp Cys Asp Lys Pro Cys Glu Phe Arg Glu Leu Arg Ser
130 135 140

CTG TAT CTT CCC GCG CTA CTA ACG TGC ACC ATA TGT TAC AAA GCC ATG 480
Leu Tyr Leu Pro Ala Leu Leu Thr Cys Thr Ile Cys Tyr Lys Ala Met
145 150 155 160

TTC AGG ATA GTG TGC AGG TAC CTG GAG TTC TGG GAG TTC GAA CAG TGT 528
Phe Arg Ile Val Cys Arg Tyr Leu Glu Phe Trp Glu Phe Glu Gln Cys
165 170 175

TTT CAT GCG TTT CTG GCG GTG TTG CCC CAT AGT CTA CAA CCC ACA ATC 576
Phe His Ala Phe Leu Ala Val Leu Pro His Ser Leu Gln Pro Thr Ile
180 185 190

TAT CAA AAT TAT TTT GCA CTC CTG GAG AGC CTG AAG CAT CTC TCG TTT 624
Tyr Gln Asn Tyr Phe Ala Leu Leu Glu Ser Leu Lys His Leu Ser Phe
195 200 205

TCA ATA ATG CCA CCC GCA TCC CCA GAC GCA CAG CTA CAT TTT TTA AAG 672
Ser Ile Met Pro Pro Ala Ser Pro Asp Ala Gln Leu His Phe Leu Lys
210 215 220

TTT AAC ATC AGC AGC TTC ATG GCC ACG TGG GGG TGG CAC GGA GAG CTG 720
Phe Asn Ile Ser Ser Phe Met Ala Thr Trp Gly Trp His Gly Glu Leu
225 230 235 240

GTC TCG CTG CGC CGT GCC ATC GCT CAC AAC GTA GAG CGA CTG CCC ACC 768
Val Ser Leu Arg Arg Ala Ile Ala His Asn Val Glu Arg Leu Pro Thr
245 250 255

GTG CTG AAG AAC CTG TCG AAA CAG AGT AAG CAC CAG GAC GTC AAG GTT 816
Val Leu Lys Asn Leu Ser Lys Gln Ser Lys His Gln Asp Val Lys Val
260 265 270

AAC GGA CGG GAT CTG GTG GGC TTT CAG CTG GCT CTA AAC CAG CTC GTG 864
Asn Gly Arg Asp Leu Val Gly Phe Gln Leu Ala Leu Asn Gln Leu Val
275 280 285

TCC CGT CTG CAC GTA AAA ATC CAA CGC AAG GAC CCC GGA CCA AAG CCA 912
Ser Arg Leu His Val Lys Ile Gln Arg Lys Asp Pro Gly Pro Lys Pro
290 295 300

TAC AGG GTG GTC GTC AGT ACC CCA GAT TGT ACC TAC TAT CTA GTG TAT 960
Tyr Arg Val Val Val Ser Thr Pro Asp Cys Thr Tyr Tyr Leu Val Tyr
305 310 315 320

CCG GGC ACA CCG GCC ATC TAC AGA CTC GTC ATG TGT ATG GCA GTG GCA 1008
Pro Gly Thr Pro Ala Ile Tyr Arg Leu Val Met Cys Met Ala Val Ala
325 330 335

GAC TGC ATC GGC CAC TCG TGC AGC GGA CTG CAC CCC TGC GCA AAC TTT 1056
Asp Cys Ile Gly His Ser Cys Ser Gly Leu His Pro Cys Ala Asn Phe
340 345 350

TTA GGC ACC CAC GAG ACA CCG CGT CTC CTG GCG GCG ACG CTT TCA AGA 1104
Leu Gly Thr His Glu Thr Pro Arg Leu Leu Ala Ala Thr Leu Ser Arg
355 360 365

ATC CGG TAC GCG CCG AAA GAC CGG CGA GCA GCC ATG AAA GGA AAT TTG 1152
Ile Arg Tyr Ala Pro Lys Asp Arg Arg Ala Ala Met Lys Gly Asn Leu
370 375 380

CAG GCG TGC TTC CAA CGA TAC GCG GCC ACG GAC GCG CGG ACT CTG GGC 1200
Gln Ala Cys Phe Gln Arg Tyr Ala Ala Thr Asp Ala Arg Thr Leu Gly
385 390 395 400

AGC TCT ACA GTG TCA GAC ATG CTG GAA CCC ACA AAA CAC GTC AGT TTG 1248
Ser Ser Thr Val Ser Asp Met Leu Glu Pro Thr Lys His Val Ser Leu
405 410 415

GAA AAC TTC AAG ATC ACC ATA TTC AAC ACC AAC ATG GTG ATT AAC ACT 1296
Glu Asn Phe Lys Ile Thr Ile Phe Asn Thr Asn Met Val Ile Asn Thr
420 425 430

AAG ATA AGC TGC CAC GTT CCT AAC ACC CTG CAA AAG ACT ATT TTA AAC 1344
Lys Ile Ser Cys His Val Pro Asn Thr Leu Gln Lys Thr Ile Leu Asn
435 440 445

ATC CCC AGA TTG ACC AAC AAT TTT GTT ATA CGA AAG TAC TCC GTA AAG 1392
Ile Pro Arg Leu Thr Asn Asn Phe Val Ile Arg Lys Tyr Ser Val Lys
450 455 460

GAA CCT TCT TTT ACC ATA AGC GTG TTT TTT TCC GAC AAC ATG TGT CAA 1440
Glu Pro Ser Phe Thr Ile Ser Val Phe Phe Ser Asp Asn Met Cys Gln
465 470 475 480

GGC ACC GCA ATA AAC ATC AAC ATC AGT GGG GAC ATG CTG CAC TTT CTC 1488
Gly Thr Ala Ile Asn Ile Asn Ile Ser Gly Asp Met Leu His Phe Leu
485 490 495

TTC GCA ATG GGT ACG CTG AAA TGC TTT CTG CCA ATC AGG CAC ATA TTT 1536
Phe Ala Met Gly Thr Leu Lys Cys Phe Leu Pro Ile Arg His Ile Phe
500 505 510

CCT GTA TCG ATA GCA AAT TGG AAC TCC ACG TTG GAC CTG CAC GGA CTG 1584
Pro Val Ser Ile Ala Asn Trp Asn Ser Thr Leu Asp Leu His Gly Leu
515 520 525

GAA AAC CAG TAC ATG GTG AGA ATG GGG CGA AAA AAC GTA TTT TGG ACC 1632
Glu Asn Gln Tyr Met Val Arg Met Gly Arg Lys Asn Val Phe Trp Thr
530 535 540

ACA AAC TTT CCA TCT GTG GTC TCC AGC AAG GAT GGG CTA AAC GTG TCC 1680
Thr Asn Phe Pro Ser Val Val Ser Ser Lys Asp Gly Leu Asn Val Ser
545 550 555 560

TGG TTT AAG GCC GCG ACA GCC ACG ATT TCT AAA GTG TAC GGG CAG CCT 1728
Trp Phe Lys Ala Ala Thr Ala Thr Ile Ser Lys Val Tyr Gly Gln Pro
565 570 575

CTT GTG GAA CAG ATT CGC CAC GAG CTG GCG CCC ATT CTC ACG GAC CAG 1776
Leu Val Glu Gln Ile Arg His Glu Leu Ala Pro Ile Leu Thr Asp Gln
580 585 590

CAC GCG CGC ATC GAC GGA AAC AAA AAT AGA ATA TTC TCC CTA CTT GAG 1824
His Ala Arg Ile Asp Gly Asn Lys Asn Arg Ile Phe Ser Leu Leu Glu
595 600 605

CAC AGA AAC CGT TCC CAA ATA CAG ACG CTA CAC AAA AGG TTC CTG GAG 1872
His Arg Asn Arg Ser Gln Ile Gln Thr Leu His Lys Arg Phe Leu Glu
610 615 620

TGT CTG GTG GAA TGC TGT TCG TTT CTC AGG CTT GAC GTG GCT TGC ATT 1920
Cys Leu Val Glu Cys Cys Ser Phe Leu Arg Leu Asp Val Ala Cys Ile
625 630 635 640

AGG CGA GCC GCC GCC CGG GGC CTG TTT GAC TTC TCA AAG AAG ATA ATC 1968
Arg Arg Ala Ala Ala Arg Gly Leu Phe Asp Phe Ser Lys Lys Ile Ile
645 650 655

AGT CAC ACT AAA AGC AAA CAC GAG TGC GCA GTA CTG GGA TAT AAA AAG 2016
Ser His Thr Lys Ser Lys His Glu Cys Ala Val Leu Gly Tyr Lys Lys
660 665 670

TGT AAC CTA ATC CCG AAA ATC TAT GCC CGA AAC AAG AAG ACC AGG CTA 2064
Cys Asn Leu Ile Pro Lys Ile Tyr Ala Arg Asn Lys Lys Thr Arg Leu
675 680 685

GAC GAG TTG GGC CGC AAT GCA AAC TTC ATT TCG TTC GTC GCC ACC ACG 2112
Asp Glu Leu Gly Arg Asn Ala Asn Phe Ile Ser Phe Val Ala Thr Thr
690 695 700

GGT CAT CGG TTC GCC GCT CTA AAG CCA CAA ATT GTC CGT CAC GCC ATT 2160
Gly His Arg Phe Ala Ala Leu Lys Pro Gln Ile Val Arg His Ala Ile
705 710 715 720

CGC AAA CTA GGC CTG CAC TGG CGC CAC CGA ACG GCC GCG TCC AAC GAG 2208
Arg Lys Leu Gly Leu His Trp Arg His Arg Thr Ala Ala Ser Asn Glu
725 730 735

CAG ACA CCG CCA GCC GAT CCC CGC GTA CGT TGC GTC CGT CCG CTG GTC 2256
Gln Thr Pro Pro Ala Asp Pro Arg Val Arg Cys Val Arg Pro Leu Val
740 745 750

TAA 2259



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 752 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

Met Ala Ala Leu Glu Gly Pro Leu Leu Leu Pro Pro Ser Ala Ser Leu
1 5 10 15

Thr Thr Ser Pro Gln Thr Thr Cys Tyr Gln Ala Thr Trp Glu Ser Gln
20 25 30

Leu Glu Ile Phe Cys Cys Leu Ala Thr Asn Ser His Leu Gln Ala Glu
35 40 45

Leu Thr Leu Glu Gly Leu Asp Lys Met Met Gln Pro Glu Pro Thr Phe
50 55 60

Phe Ala Cys Arg Ala Ile Arg Arg Leu Leu Leu Gly Glu Arg Leu His
65 70 75 80

Pro Phe Ile His Gln Glu Gly Thr Leu Leu Gly Lys Val Gly Arg Arg
85 90 95

Tyr Ser Gly Glu Gly Leu Ile Ile Asp Gly Gly Gly Val Phe Thr Arg
100 105 110

Gly Gln Ile Asp Thr Asp Asn Tyr Leu Pro Ala Val Gly Ser Trp Glu
115 120 125

Leu Thr Asp Asp Cys Asp Lys Pro Cys Glu Phe Arg Glu Leu Arg Ser
130 135 140

Leu Tyr Leu Pro Ala Leu Leu Thr Cys Thr Ile Cys Tyr Lys Ala Met
145 150 155 160

Phe Arg Ile Val Cys Arg Tyr Leu Glu Phe Trp Glu Phe Glu Gln Cys
165 170 175

Phe His Ala Phe Leu Ala Val Leu Pro His Ser Leu Gln Pro Thr Ile
180 185 190

Tyr Gln Asn Tyr Phe Ala Leu Leu Glu Ser Leu Lys His Leu Ser Phe
195 200 205

Ser Ile Met Pro Pro Ala Ser Pro Asp Ala Gln Leu His Phe Leu Lys
210 215 220

Phe Asn Ile Ser Ser Phe Met Ala Thr Trp Gly Trp His Gly Glu Leu
225 230 235 240

Val Ser Leu Arg Arg Ala Ile Ala His Asn Val Glu Arg Leu Pro Thr
245 250 255

Val Leu Lys Asn Leu Ser Lys Gln Ser Lys His Gln Asp Val Lys Val
260 265 270

Asn Gly Arg Asp Leu Val Gly Phe Gln Leu Ala Leu Asn Gln Leu Val
275 280 285

Ser Arg Leu His Val Lys Ile Gln Arg Lys Asp Pro Gly Pro Lys Pro
290 295 300

Tyr Arg Val Val Val Ser Thr Pro Asp Cys Thr Tyr Tyr Leu Val Tyr
305 310 315 320

Pro Gly Thr Pro Ala Ile Tyr Arg Leu Val Met Cys Met Ala Val Ala
325 330 335

Asp Cys Ile Gly His Ser Cys Ser Gly Leu His Pro Cys Ala Asn Phe
340 345 350

Leu Gly Thr His Glu Thr Pro Arg Leu Leu Ala Ala Thr Leu Ser Arg
355 360 365

Ile Arg Tyr Ala Pro Lys Asp Arg Arg Ala Ala Met Lys Gly Asn Leu
370 375 380

Gln Ala Cys Phe Gln Arg Tyr Ala Ala Thr Asp Ala Arg Thr Leu Gly
385 390 395 400

Ser Ser Thr Val Ser Asp Met Leu Glu Pro Thr Lys His Val Ser Leu
405 410 415

Glu Asn Phe Lys Ile Thr Ile Phe Asn Thr Asn Met Val Ile Asn Thr
420 425 430

Lys Ile Ser Cys His Val Pro Asn Thr Leu Gln Lys Thr Ile Leu Asn
435 440 445

Ile Pro Arg Leu Thr Asn Asn Phe Val Ile Arg Lys Tyr Ser Val Lys
450 455 460

Glu Pro Ser Phe Thr Ile Ser Val Phe Phe Ser Asp Asn Met Cys Gln
465 470 475 480

Gly Thr Ala Ile Asn Ile Asn Ile Ser Gly Asp Met Leu His Phe Leu
485 490 495

Phe Ala Met Gly Thr Leu Lys Cys Phe Leu Pro Ile Arg His Ile Phe
500 505 510

Pro Val Ser Ile Ala Asn Trp Asn Ser Thr Leu Asp Leu His Gly Leu
515 520 525

Glu Asn Gln Tyr Met Val Arg Met Gly Arg Lys Asn Val Phe Trp Thr
530 535 540

Thr Asn Phe Pro Ser Val Val Ser Ser Lys Asp Gly Leu Asn Val Ser
545 550 555 560

Trp Phe Lys Ala Ala Thr Ala Thr Ile Ser Lys Val Tyr Gly Gln Pro
565 570 575

Leu Val Glu Gln Ile Arg His Glu Leu Ala Pro Ile Leu Thr Asp Gln
580 585 590

His Ala Arg Ile Asp Gly Asn Lys Asn Arg Ile Phe Ser Leu Leu Glu
595 600 605

His Arg Asn Arg Ser Gln Ile Gln Thr Leu His Lys Arg Phe Leu Glu
610 615 620

Cys Leu Val Glu Cys Cys Ser Phe Leu Arg Leu Asp Val Ala Cys Ile
625 630 635 640

Arg Arg Ala Ala Ala Arg Gly Leu Phe Asp Phe Ser Lys Lys Ile Ile
645 650 655

Ser His Thr Lys Ser Lys His Glu Cys Ala Val Leu Gly Tyr Lys Lys
660 665 670

Cys Asn Leu Ile Pro Lys Ile Tyr Ala Arg Asn Lys Lys Thr Arg Leu
675 680 685

Asp Glu Leu Gly Arg Asn Ala Asn Phe Ile Ser Phe Val Ala Thr Thr
690 695 700

Gly His Arg Phe Ala Ala Leu Lys Pro Gln Ile Val Arg His Ala Ile
705 710 715 720

Arg Lys Leu Gly Leu His Trp Arg His Arg Thr Ala Ala Ser Asn Glu
725 730 735

Gln Thr Pro Pro Ala Asp Pro Arg Val Arg Cys Val Arg Pro Leu Val
740 745 750



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 364 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..364
(D) OTHER INFORMATION: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

ATG GTA CGT CCA ACC GAG GCC GAG GTT AAG AAA TCC CTG AGC AGG CTT 48
Met Val Arg Pro Thr Glu Ala Glu Val Lys Lys Ser Leu Ser Arg Leu
1 5 10 15

CCA GCA GCA CGC AAA AGA GCA GGT AAC CGG GCC CAC CTG GCC ACC TA[illegible] 96
Pro Ala Ala Arg Lys Arg Ala Gly Asn Arg Ala His Leu Ala Thr Tyr
20 25 30

CGC CGG CTC CTC AAG TAC TCC ACC CTG CCC GAT CTA TGG CGG TTT CTA 144
Arg Arg Leu Leu Lys Tyr Ser Thr Leu Pro Asp Leu Trp Arg Phe Leu
35 40 45

AGT AGC CGG CCC CAG AAC CCT CCC CTT GGA CAC CAC AGA TTA TTC TTT 192
Ser Ser Arg Pro Gln Asn Pro Pro Leu Gly His His Arg Leu Phe Phe
50 55 60

GAG GTG ACT CTA GGG CAC AGA ATT GCC GAC TGC GTA ATT CTG GTA TCG 240
Glu Val Thr Leu Gly His Arg Ile Ala Asp Cys Val Ile Leu Val Ser
65 70 75 80

GGT GGG CAT CAG CCC GTA TGT TAC GTT GTA GAG CTC AAG ACT TGT CTG 288
Gly Gly His Gln Pro Val Cys Tyr Val Val Glu Leu Lys Thr Cys Leu
85 90 95

AGT CAC CAG CTG ATC CCA ACC AAC ACC GTG AGA ACG TCA CAG CGA GCT 336
Ser His Gln Leu Ile Pro Thr Asn Thr Val Arg Thr Ser Gln Arg Ala
100 105 110

CAA GGC CTG TGC CAA CTC TCC GAC TCG A 364
Gln Gly Leu Cys Gln Leu Ser Asp Ser
115 120

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 121 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Met Val Arg Pro Thr Glu Ala Glu Val Lys Lys Ser Leu Ser Arg Leu
1 5 10 15

Pro Ala Ala Arg Lys Arg Ala Gly Asn Arg Ala His Leu Ala Thr Tyr
20 25 30

Arg Arg Leu Leu Lys Tyr Ser Thr Leu Pro Asp Leu Trp Arg Phe Leu
35 40 45

Ser Ser Arg Pro Gln Asn Pro Pro Leu Gly His His Arg Leu Phe Phe
50 55 60

Glu Val Thr Leu Gly His Arg Ile Ala Asp Cys Val Ile Leu Val Ser
65 70 75 80

Gly Gly His Gln Pro Val Cys Tyr Val Val Glu Leu Lys Thr Cys Leu
85 90 95

Ser His Gln Leu Ile Pro Thr Asn Thr Val Arg Thr Ser Gln Arg Ala
100 105 110

Gln Gly Leu Cys Gln Leu Ser Asp Ser
115 120

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 918 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..918
(D) OTHER INFORMATION: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

ATG GCA CTC GAC AAG AGT ATA GTG GTT AAC TTC ACC TCC AGA CTC TTC 48
Met Ala Leu Asp Lys Ser Ile Val Val Asn Phe Thr Ser Arg Leu Phe
1 5 10 15

GCT GAT GAA CTG GCC GCC CTT CAG TCA AAA ATA GGG AGC GTA CTG CC[illegible] 96
Ala Asp Glu Leu Ala Ala Leu Gln Ser Lys Ile Gly Ser Val Leu Pro
20 25 30

CTC GGA GAT TGC CAC CGT TTA CAA AAT ATA CAG GCA TTG GGC CTG GGG 144
Leu Gly Asp Cys His Arg Leu Gln Asn Ile Gln Ala Leu Gly Leu Gly
35 40 45

TGC GTA TGC TCA CGT GAG ACA TCT CCG GAC TAC ATC CAA ATT ATG CAG 192
Cys Val Cys Ser Arg Glu Thr Ser Pro Asp Tyr Ile Gln Ile Met Gln
50 55 60

TAT CTA TCC AAG TGC ACA CTC GCT GTC CTG GAG GAG GTT CGC CCG GAC 240
Tyr Leu Ser Lys Cys Thr Leu Ala Val Leu Glu Glu Val Arg Pro Asp
65 70 75 80

AGC CTG CGC CTA ACG CGG ATG GAT CCC TCT GAC AAC CTT CAG ATA AAA 288
Ser Leu Arg Leu Thr Arg Met Asp Pro Ser Asp Asn Leu Gln Ile Lys
85 90 95

AAC GTA TAT GCC CCC TTT TTT CAG TGG GAC AGC AAC ACC CAG CTA GCA 336
Asn Val Tyr Ala Pro Phe Phe Gln Trp Asp Ser Asn Thr Gln Leu Ala
100 105 110

GTG CTA CCC CCA TTT TTT AGC CGA AAG GAT TCC ACC ATT GTG CTC GAA 384
Val Leu Pro Pro Phe Phe Ser Arg Lys Asp Ser Thr Ile Val Leu Glu
115 120 125

TCC AAC GGA TTT GAC CCC GTG TTC CCC ATG GTC GTG CCG CAG CAA CTG 432
Ser Asn Gly Phe Asp Pro Val Phe Pro Met Val Val Pro Gln Gln Leu
130 135 140

GGG CAC GCT ATT CTG CAG CAG CTG TTG GTG TAC CAC ATC TAC TCC AAA 480
Gly His Ala Ile Leu Gln Gln Leu Leu Val Tyr His Ile Tyr Ser Lys
145 150 155 160

ATA TCG GCC GGG GCC CCG GAT GAT GTA AAT ATG GCG GAA CTT GAT CTA 528
Ile Ser Ala Gly Ala Pro Asp Asp Val Asn Met Ala Glu Leu Asp Leu
165 170 175

TAT ACC ACC AAT GTG TCA TTT ATG GGG CGC ACA TAT CGT CTG GAC GTA 576
Tyr Thr Thr Asn Val Ser Phe Met Gly Arg Thr Tyr Arg Leu Asp Val
180 185 190

GAC AAC ACG GAT CCA CGT ACT GCC CTG CGA GTG CTT GAC GAT CTG TCC 624 
Asp Asn Thr Asp Pro Arg Thr Ala Leu Arg Val Leu Asp Asp Leu Se- 
195 200 205 

ATG TAC CTT TGT ATC CTA TCA GCC TTG GTT CCC AGG GGG TGT CTC CGT 6 72 

Met Tyr Leu Cys He Leu Ser Ala Leu Val Pro Arg Glv Cys Leu Arc? 
210 215 220 * 

CTG CTC ACG GCG CTC GTG CGG CAC GAC AGG CAT CCT CTG ACA GAG GTG 72 0 

Leu Leu Thr Ala Leu Val Arg His Asp Arg His Pro Leu Thr Glu Va"> 
225 230 235 240 

TTT GAG GGG GTG GTG CCA GAT GAG GTG ACC AGG ATA GAT CTC GAC CAG 76 8 

Phe Glu Gly Val Val Pro Asp Glu Val Thr Arg He Asp Leu Aso Gin 
245 250 255 

TTG AGC GTC CCA GAT GAC ATC ACC AGG ATG CGC GTC ATG TTC TCC TAT 816 
Leu Ser Val Pro Asp Asp He Thr Arg Met Arg Val Met Phe Ser Tyr 
260 265 270 

CTT CAG AGT CTC AGT TCT ATA TTT AAT CTT GGC CCC AGA CTG CAC GTG 86 4 

Leu Gin Ser Leu Ser Ser He Phe Asn Leu Gly Pro Arg Leu His Val 
275 280 285 

TAT GCC TAC TCG GCA GAG ACT TTG GCG GCC TCC TGT TGG TAT TCC CCA 912 
Tyr Ala Tyr Ser Ala Glu Thr Leu Ala Ala Ser Cys Trp Tvr Ser Pro 
290 295 300 
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CGC TAA 
Arg 

305 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 305 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Met Ala Leu Asp Lys Ser Ile Val Val Asn Phe Thr Ser Arg Leu Phe
1 5 10 15

Ala Asp Glu Leu Ala Ala Leu Gln Ser Lys Ile Gly Ser Val Leu Pro
20 25 30

Leu Gly Asp Cys His Arg Leu Gln Asn Ile Gln Ala Leu Gly Leu Gly
35 40 45

Cys Val Cys Ser Arg Glu Thr Ser Pro Asp Tyr Ile Gln Ile Met Gln
50 55 60

Tyr Leu Ser Lys Cys Thr Leu Ala Val Leu Glu Glu Val Arg Pro Asp
65 70 75 80

Ser Leu Arg Leu Thr Arg Met Asp Pro Ser Asp Asn Leu Gln Ile Lys
85 90 95

Asn Val Tyr Ala Pro Phe Phe Gln Trp Asp Ser Asn Thr Gln Leu Ala
100 105 110

Val Leu Pro Pro Phe Phe Ser Arg Lys Asp Ser Thr Ile Val Leu Glu
115 120 125

Ser Asn Gly Phe Asp Pro Val Phe Pro Met Val Val Pro Gln Gln Leu
130 135 140

Gly His Ala Ile Leu Gln Gln Leu Leu Val Tyr His Ile Tyr Ser Lys
145 150 155 160

Ile Ser Ala Gly Ala Pro Asp Asp Val Asn Met Ala Glu Leu Asp Leu
165 170 175

Tyr Thr Thr Asn Val Ser Phe Met Gly Arg Thr Tyr Arg Leu Asp Val
180 185 190

Asp Asn Thr Asp Pro Arg Thr Ala Leu Arg Val Leu Asp Asp Leu Ser
195 200 205

Met Tyr Leu Cys Ile Leu Ser Ala Leu Val Pro Arg Gly Cys Leu Arg
210 215 220

Leu Leu Thr Ala Leu Val Arg His Asp Arg His Pro Leu Thr Glu Val
225 230 235 240

Phe Glu Gly Val Val Pro Asp Glu Val Thr Arg Ile Asp Leu Asp Gln
245 250 255

Leu Ser Val Pro Asp Asp Ile Thr Arg Met Arg Val Met Phe Ser Tyr
260 265 270

Leu Gln Ser Leu Ser Ser Ile Phe Asn Leu Gly Pro Arg Leu His Val
275 280 285

Tyr Ala Tyr Ser Ala Glu Thr Leu Ala Ala Ser Cys Trp Tyr Ser Pro
290 295 300

Arg
305

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 873 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..873
(D) OTHER INFORMATION: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

ATG GCG TCA TCT GAT ATT CTG TCG GTT GCA AGG ACG GAT GAC GGC TCC 48
Met Ala Ser Ser Asp Ile Leu Ser Val Ala Arg Thr Asp Asp Gly Ser
1 5 10 15

GTC TGT GAA GTC TCC CTG CGT GGA GGT AGG AAA AAA ACT ACC GTC TAC 96
Val Cys Glu Val Ser Leu Arg Gly Gly Arg Lys Lys Thr Thr Val Tyr
20 25 30

CTG CCG GAC ACT GAA CCC TGG GTG GTA GAG ACC GAC GCC ATC AAA GA~ 144
Leu Pro Asp Thr Glu Pro Trp Val Val Glu Thr Asp Ala Ile Lys Asp
35 40 45

GCC TTC CTC AGC GAC GGG ATC GTG GAT ATG GCT CGA AAG CTT CAT CGT 192
Ala Phe Leu Ser Asp Gly Ile Val Asp Met Ala Arg Lys Leu His Arg
50 55 60

GGT GCC CTG CCC TCA AAT TCT CAC AAC GGC TTG AGG ATG GTG CT^ ^ 240
Gly Ala Leu Pro Ser Asn Ser His Asn Gly Leu Arg Met Val Leu Phe
65 70 75 80

TGT TAT TGT TAC TTG CAA AAT TGT GTG TAC CTA GCC CTG TTT CTG TGC 288
Cys Tyr Cys Tyr Leu Gln Asn Cys Val Tyr Leu Ala Leu Phe Leu Cys
85 90 95

CCC CTT AAT CCT TAC TTG GTA ACT CCC TCA AGC ATT GAG TTT GCC GAG 336
Pro Leu Asn Pro Tyr Leu Val Thr Pro Ser Ser Ile Glu Phe Ala Glu
100 105 110

CCC GTT GTG GCA CCT GAG GTG CTC TTC CCA CAC CCG GCT GAG ATG TCT 384
Pro Val Val Ala Pro Glu Val Leu Phe Pro His Pro Ala Glu Met Ser
115 120 125

CGC GGT TGC GAT GAC GCG ATT TTC TGT AAA CTG CCC TAT ACC GTG CC 432
Arg Gly Cys Asp Asp Ala Ile Phe Cys Lys Leu Pro Tyr Thr Val Pro
130 135 140

ATA ATC AAC ACC ACG TTT GGA CGC ATT TAC CCG AAC TCT ACA CGC GAG 480
Ile Ile Asn Thr Thr Phe Gly Arg Ile Tyr Pro Asn Ser Thr Arg Glu
145 150 155 160

CCG GAC GGC AGG CCT ACG GAT TAC TCC ATG GCC CTT AGA AGG GCT TTT 528
Pro Asp Gly Arg Pro Thr Asp Tyr Ser Met Ala Leu Arg Arg Ala Phe
165 170 175

GCA GTT ATG GTT AAC ACG TCA TGT GCA GGA GTG ACA TTG TGC CGC GGA 576
Ala Val Met Val Asn Thr Ser Cys Ala Gly Val Thr Leu Cys Arg Gly
180 185 190

GAA ACT CAG ACC GCA TCC CGT AAC CAC ACT GAG TGG GAA AAT CTG CTG 624
Glu Thr Gln Thr Ala Ser Arg Asn His Thr Glu Trp Glu Asn Leu Leu
195 200 205

GCT ATG TTT TCT GTG ATT ATC TAT GCC TTA GAT CAC AAC TGT CAC CCG 672
Ala Met Phe Ser Val Ile Ile Tyr Ala Leu Asp His Asn Cys His Pro
210 215 220

GAA GCA CTG TCT ATC GCG AGC GGC ATC TTT GAC GAG CGT GAC TAT GGA 720
Glu Ala Leu Ser Ile Ala Ser Gly Ile Phe Asp Glu Arg Asp Tyr Gly
225 230 235 240

TTA TTC ATC TCT CAG CCC CGG AGC GTG CCC TCG CCT ACC CCT TGC GAC 768
Leu Phe Ile Ser Gln Pro Arg Ser Val Pro Ser Pro Thr Pro Cys Asp
245 250 255

GTG TCG TGG GAA GAT ATC TAC AAC GGG ACT TAC CTA GCT CGG CCT GGA 816
Val Ser Trp Glu Asp Ile Tyr Asn Gly Thr Tyr Leu Ala Arg Pro Gly
260 265 270

AAC TGT GAC CCC TGG CCC AAT CTA TCC ACC CCT CCC TTG ATT CTA AAT 864
Asn Cys Asp Pro Trp Pro Asn Leu Ser Thr Pro Pro Leu Ile Leu Asn
275 280 285

TTT AAA TAA 873
Phe Lys
290

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 290 amino acids
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

Met Ala Ser Ser Asp Ile Leu Ser Val Ala Arg Thr Asp Asp Gly Ser
1 5 10 15

Val Cys Glu Val Ser Leu Arg Gly Gly Arg Lys Lys Thr Thr Val Tyr
20 25 30

Leu Pro Asp Thr Glu Pro Trp Val Val Glu Thr Asp Ala Ile Lys Asp
35 40 45

Ala Phe Leu Ser Asp Gly Ile Val Asp Met Ala Arg Lys Leu His Arg
50 55 60

Gly Ala Leu Pro Ser Asn Ser His Asn Gly Leu Arg Met Val Leu Phe
65 70 75 80

Cys Tyr Cys Tyr Leu Gln Asn Cys Val Tyr Leu Ala Leu Phe Leu Cys
85 90 95

Pro Leu Asn Pro Tyr Leu Val Thr Pro Ser Ser Ile Glu Phe Ala Glu
100 105 110

Pro Val Val Ala Pro Glu Val Leu Phe Pro His Pro Ala Glu Met Ser
115 120 125

Arg Gly Cys Asp Asp Ala Ile Phe Cys Lys Leu Pro Tyr Thr Val Pro
130 135 140

Ile Ile Asn Thr Thr Phe Gly Arg Ile Tyr Pro Asn Ser Thr Arg Glu
145 150 155 160

Pro Asp Gly Arg Pro Thr Asp Tyr Ser Met Ala Leu Arg Arg Ala Phe
165 170 175

Ala Val Met Val Asn Thr Ser Cys Ala Gly Val Thr Leu Cys Arg Gly
180 185 190

Glu Thr Gln Thr Ala Ser Arg Asn His Thr Glu Trp Glu Asn Leu Leu
195 200 205

Ala Met Phe Ser Val Ile Ile Tyr Ala Leu Asp His Asn Cys His Pro
210 215 220

Glu Ala Leu Ser Ile Ala Ser Gly Ile Phe Asp Glu Arg Asp Tyr Gly
225 230 235 240

Leu Phe Ile Ser Gln Pro Arg Ser Val Pro Ser Pro Thr Pro Cys Asp
245 250 255

Val Ser Trp Glu Asp Ile Tyr Asn Gly Thr Tyr Leu Ala Arg Pro Gly
260 265 270

Asn Cys Asp Pro Trp Pro Asn Leu Ser Thr Pro Pro Leu Ile Leu Asn
275 280 285

Phe Lys
290

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 363 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..363
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

ATG AGC ATG ACT TTC CCC GTC TCC AGT CAC CGG AGG AAT GGT GGA CGG 48
Met Ser Met Thr Phe Pro Val Ser Ser His Arg Arg Asn Gly Gly Arg
1 5 10 15

CTC CGT CCT GGT GCG AAT GGC CAC CAA GCC TCC CGT GAT TGG TCT TAT 96
Leu Arg Pro Gly Ala Asn Gly His Gln Ala Ser Arg Asp Trp Ser Tyr
20 25 30

AAC AGT GCT CTT CCT CCT AGT CAT AGG CGC CTG CGT CTA CTG CTG CAT 144
Asn Ser Ala Leu Pro Pro Ser His Arg Arg Leu Arg Leu Leu Leu His
35 40 45

TCG CGT GTT CCT GGC GGC TCG ACT GTG GCG CGC CAC CCC ACT AGG CAG 192
Ser Arg Val Pro Gly Gly Ser Thr Val Ala Arg His Pro Thr Arg Gln
50 55 60

GGC CAC CGT GGC GTA TCA GGT CCT TCG CAC CCT GGG ACC GCA GGC CGG 240
Gly His Arg Gly Val Ser Gly Pro Ser His Pro Gly Thr Ala Gly Arg
65 70 75 80

GTC ACA TGC ACC GCC GAC GGT GGG CAT AGC TAC CCA GGA GCC CTA CCG 288
Val Thr Cys Thr Ala Asp Gly Gly His Ser Tyr Pro Gly Ala Leu Pro
85 90 95

TAC AAT ATA CAT GCC AGA TTA GAA CGG GGT GTG TGC TAT AAT GGA TGG 336
Tyr Asn Ile His Ala Arg Leu Glu Arg Gly Val Cys Tyr Asn Gly Trp
100 105 110

CTA TGG GGG GGG GCT GTA GAT AAT TGA 363
Leu Trp Gly Gly Ala Val Asp Asn
115 120

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 120 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Met Ser Met Thr Phe Pro Val Ser Ser His Arg Arg Asn Gly Gly Arg
1 5 10 15

Leu Arg Pro Gly Ala Asn Gly His Gln Ala Ser Arg Asp Trp Ser Tyr
20 25 30

Asn Ser Ala Leu Pro Pro Ser His Arg Arg Leu Arg Leu Leu Leu His
35 40 45

Ser Arg Val Pro Gly Gly Ser Thr Val Ala Arg His Pro Thr Arg Gln
50 55 60

Gly His Arg Gly Val Ser Gly Pro Ser His Pro Gly Thr Ala Gly Arg
65 70 75 80

Val Thr Cys Thr Ala Asp Gly Gly His Ser Tyr Pro Gly Ala Leu Pro
85 90 95

Tyr Asn Ile His Ala Arg Leu Glu Arg Gly Val Cys Tyr Asn Gly Trp
100 105 110

Leu Trp Gly Gly Ala Val Asp Asn
115 120

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 921 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..921
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

ATG CTG CTC AGC CGT CAC AGG GAG CGC CTT GCC GCC AAC CTG GAG GAG 48
Met Leu Leu Ser Arg His Arg Glu Arg Leu Ala Ala Asn Leu Glu Glu
1 5 10 15

ACQ GCC AAA GAC GCC GGA GAG AGG TGG GAA CTG AGT GCC CCG ACA TTC 96
Thr Ala Lys Asp Ala Gly Glu Arg Trp Glu Leu Ser Ala Pro Thr Phe
20 25 30

Th? fS* S AC n GT S CC *** ACG GCA CGG ATG GCG «C CCT TTT ATT GGC 144
Thr Arg His Cys Pro Lys Thr Ala Arg Met Ala His Pro Phe Ile Gly
35 40 45

vlf vlt Hif tT A ^ I CA AGT TCG GTC CTG GAA ACA TAG TGC 192
Val Val His Arg Ile Asn Ser Tyr Ser Ser Val Leu Glu Thr Tyr Cys
50 55 60

ACA CGG CAC CAT CCC GCC ACG CCC ACG TCA GCA AAT CCC GTG rrt 240
Thr Arg His His Pro Ala Thr Pro Thr Ser Ala Asn Pro Asp Val Gly
65 70 75 80

ACC CCC AGA CCG TCC GAG GAC AAC GTC CCC GCA AAG CCG CGC CTA TTG 288
Thr Pro Arg Pro Ser Glu Asp Asn Val Pro Ala Lys Pro Arg Leu Leu
85 90 95

I~ f TA I CA A 5* TAC ^ ^ ATG CGG TGT GTG CGC CAG GAC GCG 336
Glu Ser Leu Ser Thr Tyr Leu Gln Met Arg Cys Val Arg Glu Asp Ala
100 105 110

CAC GTC TCC ACG GCC GAT CAA CTG GTC GAG TAC CAG GCG GGC AGA AAA 384
His Val Ser Thr Ala Asp Gln Leu Val Glu Tyr Gln Ala Gly Arg Lys
115 120 125

ACA CAC GAC TCC CTG CAC GCC TGC TCT GTC TAC CGC GAA CTT CAG GCT 432
Thr His Asp Ser Leu His Ala Cys Ser Val Tyr Arg Glu Leu Gln Ala
130 135 140

TTT CTG GTT AAC CTT TCG TCC TTT CTG AAC GGC TGT TAC GTT CCC GGG 480
Phe Leu Val Asn Leu Ser Ser Phe Leu Asn Gly Cys Tyr Val Pro Gly
145 150 155 160

GTG CAC TGG CTG GAG CCC TTC CAA CAG CAG CTA GTA ATG CAC AC™ ~ 528
Val His Trp Leu Glu Pro Phe Gln Gln Gln Leu Val Met His Thr Phe
165 170 175

TTC TTT TTG GTT TCA ATC AAG GCC CCA CAA AAG ACG CAC CAG TTG 576
Phe Phe Leu Val Ser Ile Lys Ala Pro Gln Lys Thr His Gln Leu Phe
180 185 190

GGA TTG TTT AAG CAG TAC TTC GGT TTA TTT GAA ACT CCA AAC AGT G. , 624
Gly Leu Phe Lys Gln Tyr Phe Gly Leu Phe Glu Thr Pro Asn Ser Val
195 200 205

TA CAG ACG TTT AAG CAA AAG GCA AGC GTA TTC CTA ATA CCA AGG AGA 672
Leu Gln Thr Phe Lys Gln Lys Ala Ser Val Phe Leu Ile Pro Arg Arg
210 215 220

CAC GGA AAG ACA TGG ATA GTG GTG GCG ATC ATC AGC ATG CTA CTG GCA 720
His Gly Lys Thr Trp Ile Val Val Ala Ile Ile Ser Met Leu Leu Ala
225 230 235 240

TCC GTA GAG AAC ATT AAC ATT GGG TAC GTA GCC CAC CAA AAG CAC GTA 768
Ser Val Glu Asn Ile Asn Ile Gly Tyr Val Ala His Gln Lys His Val
245 250 255

GCC AAC TCC GTG TTC GCG GAA ATC ATA AAG ACG CTT TGT CGG TGG TTC 816
Ala Asn Ser Val Phe Ala Glu Ile Ile Lys Thr Leu Cys Arg Trp Phe
260 265 270

CCC CCC AAA AAT TTA AAC ATC AAG AAG GAG AAC GGA ACC ATA ATC TAC 864
Pro Pro Lys Asn Leu Asn Ile Lys Lys Glu Asn Gly Thr Ile Ile Tyr
275 280 285

ACG CGA CCC GGA GGA CGG TCC AGC TCG CTG ATG TGC GCA ACA TGC TTC 912
Thr Arg Pro Gly Gly Arg Ser Ser Ser Leu Met Cys Ala Thr Cys Phe
290 295 300

AAT AAG AAC 921
Asn Lys Asn
305

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 307 amino acids
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

Met Leu Leu Ser Arg His Arg Glu Arg Leu Ala Ala Asn Leu Glu Glu
1 5 10 15

Thr Ala Lys Asp Ala Gly Glu Arg Trp Glu Leu Ser Ala Pro Thr Phe
20 25 30

Thr Arg His Cys Pro Lys Thr Ala Arg Met Ala His Pro Phe Ile Gly
35 40 45

Val Val His Arg Ile Asn Ser Tyr Ser Ser Val Leu Glu Thr Tyr Cys
50 55 60

Thr Arg His His Pro Ala Thr Pro Thr Ser Ala Asn Pro Asp Val Gly
65 70 75 80

Thr Pro Arg Pro Ser Glu Asp Asn Val Pro Ala Lys Pro Arg Leu Leu
85 90 95

Glu Ser Leu Ser Thr Tyr Leu Gln Met Arg Cys Val Arg Glu Asp Ala
100 105 110

His Val Ser Thr Ala Asp Gln Leu Val Glu Tyr Gln Ala Gly Arg Lys
115 120 125

Thr His Asp Ser Leu His Ala Cys Ser Val Tyr Arg Glu Leu Gln Ala
130 135 140

Phe Leu Val Asn Leu Ser Ser Phe Leu Asn Gly Cys Tyr Val Pro Gly
145 150 155 160

Val His Trp Leu Glu Pro Phe Gln Gln Gln Leu Val Met His Thr Phe
165 170 175

Phe Phe Leu Val Ser Ile Lys Ala Pro Gln Lys Thr His Gln Leu Phe
180 185 190

Gly Leu Phe Lys Gln Tyr Phe Gly Leu Phe Glu Thr Pro Asn Ser Val
195 200 205

Leu Gln Thr Phe Lys Gln Lys Ala Ser Val Phe Leu Ile Pro Arg Arg
210 215 220

His Gly Lys Thr Trp Ile Val Val Ala Ile Ile Ser Met Leu Leu Ala
225 230 235 240

Ser Val Glu Asn Ile Asn Ile Gly Tyr Val Ala His Gln Lys His Val
245 250 255

Ala Asn Ser Val Phe Ala Glu Ile Ile Lys Thr Leu Cys Arg Trp Phe
260 265 270

Pro Pro Lys Asn Leu Asn Ile Lys Lys Glu Asn Gly Thr Ile Ile Tyr
275 280 285

Thr Arg Pro Gly Gly Arg Ser Ser Ser Leu Met Cys Ala Thr Cys Phe
290 295 300

Asn Lys Asn
305

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1365 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: N

(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1365
(D) OTHER INFORMATION: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

ATG GAT GCG CAT GCT ATC AAC GAA AGA TAC GTA GGT CCT CGC TGC CAC 48
Met Asp Ala His Ala Ile Asn Glu Arg Tyr Val Gly Pro Arg Cys His
1 5 10 15

CGT TTG GCC CAC GTG GTG CTG CCT AGG ACC TTT CTG CTG CAT CAC GCC 96
Arg Leu Ala His Val Val Leu Pro Arg Thr Phe Leu Leu His His Ala
20 25 30

ATA CCC CTG GAG CCC GAG ATC ATC TTT TCC ACC TAC ACC CGG TTC AGC 144
Ile Pro Leu Glu Pro Glu Ile Ile Phe Ser Thr Tyr Thr Arg Phe Ser
35 40 45

CGG TCG CCA GGG TCA TCC CGC CGG TTG GTG GTG TGT GGG AAA CGT GTC 192
Arg Ser Pro Gly Ser Ser Arg Arg Leu Val Val Cys Gly Lys Arg Val
50 55 60

CTG CCA GGG GAG GAA AAC CAA CTT GCG TCT TCA CCT TCT GGT TTG GCG 240
Leu Pro Gly Glu Glu Asn Gln Leu Ala Ser Ser Pro Ser Gly Leu Ala
65 70 75 80

CTT AGC CTG CCT CTG TTT TCC CAC GAT GGG AAC TTT CAT CCA TTT GAC 288
Leu Ser Leu Pro Leu Phe Ser His Asp Gly Asn Phe His Pro Phe Asp
85 90 95

ATC TCG GTA CTG CGC ATT TCC TGC CCT GGT TCT AAT CTT AGT CTT ACT 336
Ile Ser Val Leu Arg Ile Ser Cys Pro Gly Ser Asn Leu Ser Leu Thr
100 105 110

GTC AGA TTT CTC TAT CTA TCT CTG GTG GTG GCT ATG GGG GCG GGA CGG 384
Val Arg Phe Leu Tyr Leu Ser Leu Val Val Ala Met Gly Ala Gly Arg
115 120 125

AAT AAT GCG CGG AGT CCG ACC GTT GAC GGG GTA TCG CCG CCA GAG GGC 432
Asn Asn Ala Arg Ser Pro Thr Val Asp Gly Val Ser Pro Pro Glu Gly
130 135 140

GCC GTA GCC CAC CCT TTG GAG GAA CTG CAG AGG CTG GCG CGT GCT ACG 480
Ala Val Ala His Pro Leu Glu Glu Leu Gln Arg Leu Ala Arg Ala Thr
145 150 155 160

CCG GAC CCG GCA CTC ACC CGT GGA CCG TTG CAG GTC CTG ACC GGC CTT 528
Pro Asp Pro Ala Leu Thr Arg Gly Pro Leu Gln Val Leu Thr Gly Leu
165 170 175

CTC CGC GCA GGG TCA GAC GGA GAC CGC GCC ACT CAC CAC ATG GCG CTC 576
Leu Arg Ala Gly Ser Asp Gly Asp Arg Ala Thr His His Met Ala Leu
180 185 190

GAG GCT CCG GGA ACC GTG CGT GGA GAA AGC CTA GAC CCG CCT GTT TCA 624
Glu Ala Pro Gly Thr Val Arg Gly Glu Ser Leu Asp Pro Pro Val Ser
195 200 205

CAG AAG GGG CCA GCG CGC ACA CGC CAC AGG CCA CCC CCC GTG CGA CTG 672
Gln Lys Gly Pro Ala Arg Thr Arg His Arg Pro Pro Pro Val Arg Leu
210 215 220

AGC TTC AAC CCC GTC AAT GCC GAT GTA CCC GCT ACC TGG CGA GAC GCC 720
Ser Phe Asn Pro Val Asn Ala Asp Val Pro Ala Thr Trp Arg Asp Ala
225 230 235 240

ACT AAC GTG TAC TCG GGT GCT CCC TAC TAT GTG TGT GTT TAC GAA CGC 768
Thr Asn Val Tyr Ser Gly Ala Pro Tyr Tyr Val Cys Val Tyr Glu Arg
245 250 255

GGT GGC CGT CAG GAA GAC GAC TGG CTG CCG ATA CCA CTG AGC TTC CCA 816
Gly Gly Arg Gln Glu Asp Asp Trp Leu Pro Ile Pro Leu Ser Phe Pro
260 265 270

GAA GAG CCC GTG CCC CCG CCA CCG GGC TTA GTG TTC ATG GAC GAC TTG 864
Glu Glu Pro Val Pro Pro Pro Pro Gly Leu Val Phe Met Asp Asp Leu
275 280 285

TTC ATT AAC ACG AAG CAG TGC GAC TTT GTG GAC ACG CTA GAG GCC GCC 912
Phe Ile Asn Thr Lys Gln Cys Asp Phe Val Asp Thr Leu Glu Ala Ala
290 295 300

TGT CGC ACG CAA GGC TAC ACG TTG AGA CAG CGC GTG CCT GTC GCC ATT 960
Cys Arg Thr Gln Gly Tyr Thr Leu Arg Gln Arg Val Pro Val Ala Ile
305 310 315 320

CCT CGC GAC GCG GAA ATC GCA GAC GCA GTT AAA TCG CAC TTT TTA GAG 1008
Pro Arg Asp Ala Glu Ile Ala Asp Ala Val Lys Ser His Phe Leu Glu
325 330 335

GCG TGC CTA GTG TTA CGG GGG CTG GCT TCG GAG GCT AGT GCC TGG ATA 1056
Ala Cys Leu Val Leu Arg Gly Leu Ala Ser Glu Ala Ser Ala Trp Ile
340 345 350

AGA GCT GCC ACG TCC CCG CCC CTT GGC CGC CAC GCC TGC TGG ATG GAC 1104
Arg Ala Ala Thr Ser Pro Pro Leu Gly Arg His Ala Cys Trp Met Asp
355 360 365

GTG TTA GGA TTA TGG GAA AGC CGC CCC CAC ACT CTA GGT TTG GAG TTA 1152
Val Leu Gly Leu Trp Glu Ser Arg Pro His Thr Leu Gly Leu Glu Leu
370 375 380

CGC GGC GTA AAC TGT GGC GGC ACG GAC GGT GAC TGG TTA GAG ATT TTA 1200
Arg Gly Val Asn Cys Gly Gly Thr Asp Gly Asp Trp Leu Glu Ile Leu
385 390 395 400

AAA CAG CCC GAT GTG CAA AAG ACA GTC AGC GGG AGT CTT GTG GCA TGC 1248
Lys Gln Pro Asp Val Gln Lys Thr Val Ser Gly Ser Leu Val Ala Cys
405 410 415

GTG ATC GTC ACA CCC GCA TTG GAA GCC TGG CTT GTG TTA CCT GGG GGT 1296
Val Ile Val Thr Pro Ala Leu Glu Ala Trp Leu Val Leu Pro Gly Gly
420 425 430

TTT GCT ATT AAA GCC CGC TAT AGG GCG TCG AAG GAG GAT CTG GTG TTC 1344
Phe Ala Ile Lys Ala Arg Tyr Arg Ala Ser Lys Glu Asp Leu Val Phe
435 440 445

ATT CGA GGC CGC TAT GGC TAG 1365
Ile Arg Gly Arg Tyr Gly
450



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 454 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Met Asp Ala His Ala Ile Asn Glu Arg Tyr Val Gly Pro Arg Cys His
1 5 10 15

Arg Leu Ala His Val Val Leu Pro Arg Thr Phe Leu Leu His His Ala
20 25 30

Ile Pro Leu Glu Pro Glu Ile Ile Phe Ser Thr Tyr Thr Arg Phe Ser
35 40 45

Arg Ser Pro Gly Ser Ser Arg Arg Leu Val Val Cys Gly Lys Arg Val
50 55 60

Leu Pro Gly Glu Glu Asn Gln Leu Ala Ser Ser Pro Ser Gly Leu Ala
65 70 75 80

Leu Ser Leu Pro Leu Phe Ser His Asp Gly Asn Phe His Pro Phe Asp
85 90 95

Ile Ser Val Leu Arg Ile Ser Cys Pro Gly Ser Asn Leu Ser Leu Thr
100 105 110

Val Arg Phe Leu Tyr Leu Ser Leu Val Val Ala Met Gly Ala Gly Arg
115 120 125

Asn Asn Ala Arg Ser Pro Thr Val Asp Gly Val Ser Pro Pro Glu Gly
130 135 140

Ala Val Ala His Pro Leu Glu Glu Leu Gln Arg Leu Ala Arg Ala Thr
145 150 155 160

Pro Asp Pro Ala Leu Thr Arg Gly Pro Leu Gln Val Leu Thr Gly Leu
165 170 175

Leu Arg Ala Gly Ser Asp Gly Asp Arg Ala Thr His His Met Ala Leu
180 185 190

Glu Ala Pro Gly Thr Val Arg Gly Glu Ser Leu Asp Pro Pro Val Ser
195 200 205

Gln Lys Gly Pro Ala Arg Thr Arg His Arg Pro Pro Pro Val Arg Leu
210 215 220

Ser Phe Asn Pro Val Asn Ala Asp Val Pro Ala Thr Trp Arg Asp Ala
225 230 235 240

Thr Asn Val Tyr Ser Gly Ala Pro Tyr Tyr Val Cys Val Tyr Glu Arg
245 250 255

Gly Gly Arg Gln Glu Asp Asp Trp Leu Pro Ile Pro Leu Ser Phe Pro
260 265 270

Glu Glu Pro Val Pro Pro Pro Pro Gly Leu Val Phe Met Asp Asp Leu
275 280 285

Phe Ile Asn Thr Lys Gln Cys Asp Phe Val Asp Thr Leu Glu Ala Ala
290 295 300

Cys Arg Thr Gln Gly Tyr Thr Leu Arg Gln Arg Val Pro Val Ala Ile
305 310 315 320

Pro Arg Asp Ala Glu Ile Ala Asp Ala Val Lys Ser His Phe Leu Glu
325 330 335

Ala Cys Leu Val Leu Arg Gly Leu Ala Ser Glu Ala Ser Ala Trp Ile
340 345 350

Arg Ala Ala Thr Ser Pro Pro Leu Gly Arg His Ala Cys Trp Met Asp
355 360 365

Val Leu Gly Leu Trp Glu Ser Arg Pro His Thr Leu Gly Leu Glu Leu
370 375 380

Arg Gly Val Asn Cys Gly Gly Thr Asp Gly Asp Trp Leu Glu Ile Leu
385 390 395 400

Lys Gln Pro Asp Val Gln Lys Thr Val Ser Gly Ser Leu Val Ala Cys
405 410 415

Val Ile Val Thr Pro Ala Leu Glu Ala Trp Leu Val Leu Pro Gly Gly
420 425 430

Phe Ala Ile Lys Ala Arg Tyr Arg Ala Ser Lys Glu Asp Leu Val Phe
435 440 445

Ile Arg Gly Arg Tyr Gly
450

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 984 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..984
(D) OTHER INFORMATION: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

ATG TTT GCT TTG AGC TCG CTC GTG TCC GAG GGT GAC CCG GAG GTG ACC 48
Met Phe Ala Leu Ser Ser Leu Val Ser Glu Gly Asp Pro Glu Val Thr
1 5 10 15

AGT AGG TAC GTC AAG GGC GTA CAA CTT GCC CTG GAC CTT AGC GAG AAC 96
Ser Arg Tyr Val Lys Gly Val Gln Leu Ala Leu Asp Leu Ser Glu Asn
20 25 30

ACA CCT GGA CAA TTT AAG TTG ATA GAA ACT CCC CTG AAC AGC TTC CTC 144
Thr Pro Gly Gln Phe Lys Leu Ile Glu Thr Pro Leu Asn Ser Phe Leu
35 40 45

TTG GTT TCC AAC GTG ATG CCC GAG GTC CAG CCA ATC TGC AGT GGC CGG 192
Leu Val Ser Asn Val Met Pro Glu Val Gln Pro Ile Cys Ser Gly Arg
50 55 60

CCG GCC TTG CGG CCA GAC TTT AGT AAT CTC CAC TTG CCT AGA CTG GAG 240
Pro Ala Leu Arg Pro Asp Phe Ser Asn Leu His Leu Pro Arg Leu Glu
65 70 75 80

AAG CTC CAG AGA GTC CTC GGG CAG GGT TTC GGG GCG GCG GGT GAG GAA 288
Lys Leu Gln Arg Val Leu Gly Gln Gly Phe Gly Ala Ala Gly Glu Glu
85 90 95

ATC GCA CTG GAC CCG TCT CAC GTA GAA ACA CAC GAA AAG GGC CAG GTG 336
Ile Ala Leu Asp Pro Ser His Val Glu Thr His Glu Lys Gly Gln Val
100 105 110

TTC TAC AAC CAC TAT GCT ACC GAG GAG TGG ACG TGG GCT TTG ACT CTG 384
Phe Tyr Asn His Tyr Ala Thr Glu Glu Trp Thr Trp Ala Leu Thr Leu
115 120 125

AAT AAG GAT GCG CTC CTT CGG GAG GCT GTA GAT GGC CTG TGT GAC CCC 432
Asn Lys Asp Ala Leu Leu Arg Glu Ala Val Asp Gly Leu Cys Asp Pro
130 135 140

GGA ACT TGG AAG GGT CTT CTT CCT GAC GAC CCC CTT CCG TTG CTA TGG 480
Gly Thr Trp Lys Gly Leu Leu Pro Asp Asp Pro Leu Pro Leu Leu Trp
145 150 155 160

CTG CTG TTC AAC GGA CCC GCC TCT TTT TGT CGG GCC GAC TGT TGC CTG 528
Leu Leu Phe Asn Gly Pro Ala Ser Phe Cys Arg Ala Asp Cys Cys Leu
165 170 175

TAC AAG CAG CAC TGC GGT TAC CCG GGC CCG GTG CTA CTT CCA GGT CAC 576
Tyr Lys Gln His Cys Gly Tyr Pro Gly Pro Val Leu Leu Pro Gly His
180 185 190

ATG TAC GCT CCC AAA CGG GAT CTT TTG TCG TTC GTT AAT CAT GCC CTG 624
Met Tyr Ala Pro Lys Arg Asp Leu Leu Ser Phe Val Asn His Ala Leu
195 200 205

AAG TAC ACC AAG TTT CTA TAC GGA GAT TTT TCC GGG ACA TGG GCG GCG 672
Lys Tyr Thr Lys Phe Leu Tyr Gly Asp Phe Ser Gly Thr Trp Ala Ala
210 215 220

GCT TGC CGC CCG CCA TTC GCT ACT TCT CGG ATA CAA AGG GTA GTG AGT 720
Ala Cys Arg Pro Pro Phe Ala Thr Ser Arg Ile Gln Arg Val Val Ser
225 230 235 240

CAG ATG AAA ATC ATA GAT GCT TCC GAC ACT TAC ATT TCC CAC ACC TGC 768
Gln Met Lys Ile Ile Asp Ala Ser Asp Thr Tyr Ile Ser His Thr Cys
245 250 255

CTC TTG TGT CAC ATA TAT CAG CAA AAT AGC ATA ATT GCG GGT CAG GGG 816
Leu Leu Cys His Ile Tyr Gln Gln Asn Ser Ile Ile Ala Gly Gln Gly
260 265 270

ACC CAC GTG GGT GGA ATC CTA CTG TTG AGT GGA AAA GGG ACC CAG TAT 864
Thr His Val Gly Gly Ile Leu Leu Leu Ser Gly Lys Gly Thr Gln Tyr
275 280 285

ATA ACA GGC AAT GTT CAG ACC CAA AGG TGT CCA ACT ACG GGC GAC TAT 912
Ile Thr Gly Asn Val Gln Thr Gln Arg Cys Pro Thr Thr Gly Asp Tyr
290 295 300

CTA ATC ATC CCA TCG TAT GAC ATA CCG GCG ATC ATC ACC ATG ATC AAG 960
Leu Ile Ile Pro Ser Tyr Asp Ile Pro Ala Ile Ile Thr Met Ile Lys
305 310 315 320

GAG AAT GGA CTC AAC CAA CTC TAA 984
Glu Asn Gly Leu Asn Gln Leu
325

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 327 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

Met Phe Ala Leu Ser Ser Leu Val Ser Glu Gly Asp Pro Glu Val Thr
1 5 10 15

Ser Arg Tyr Val Lys Gly Val Gln Leu Ala Leu Asp Leu Ser Glu Asn
20 25 30

Thr Pro Gly Gln Phe Lys Leu Ile Glu Thr Pro Leu Asn Ser Phe Leu
35 40 45

Leu Val Ser Asn Val Met Pro Glu Val Gln Pro Ile Cys Ser Gly Arg
50 55 60

Pro Ala Leu Arg Pro Asp Phe Ser Asn Leu His Leu Pro Arg Leu Glu
65 70 75 80

Lys Leu Gln Arg Val Leu Gly Gln Gly Phe Gly Ala Ala Gly Glu Glu
85 90 95

Ile Ala Leu Asp Pro Ser His Val Glu Thr His Glu Lys Gly Gln Val
100 105 110

Phe Tyr Asn His Tyr Ala Thr Glu Glu Trp Thr Trp Ala Leu Thr Leu
115 120 125

Asn Lys Asp Ala Leu Leu Arg Glu Ala Val Asp Gly Leu Cys Asp Pro
130 135 140

Gly Thr Trp Lys Gly Leu Leu Pro Asp Asp Pro Leu Pro Leu Leu Trp
145 150 155 160

Leu Leu Phe Asn Gly Pro Ala Ser Phe Cys Arg Ala Asp Cys Cys Leu
165 170 175

Tyr Lys Gln His Cys Gly Tyr Pro Gly Pro Val Leu Leu Pro Gly His
180 185 190

Met Tyr Ala Pro Lys Arg Asp Leu Leu Ser Phe Val Asn His Ala Leu
195 200 205

Lys Tyr Thr Lys Phe Leu Tyr Gly Asp Phe Ser Gly Thr Trp Ala Ala
210 215 220

Ala Cys Arg Pro Pro Phe Ala Thr Ser Arg Ile Gln Arg Val Val Ser
225 230 235 240

Gln Met Lys Ile Ile Asp Ala Ser Asp Thr Tyr Ile Ser His Thr Cys
245 250 255

Leu Leu Cys His Ile Tyr Gln Gln Asn Ser Ile Ile Ala Gly Gln Gly
260 265 270

Thr His Val Gly Gly Ile Leu Leu Leu Ser Gly Lys Gly Thr Gln Tyr
275 280 285

Ile Thr Gly Asn Val Gln Thr Gln Arg Cys Pro Thr Thr Gly Asp Tyr
290 295 300

Leu Ile Ile Pro Ser Tyr Asp Ile Pro Ala Ile Ile Thr Met Ile Lys
305 310 315 320

Glu Asn Gly Leu Asn Gln Leu
325



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 330 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

GGATCCCTCT GACAACCTTC AGATAAAAAA CGTATATGCC CCCTTTTTTC AGTGGGACAG 60

CAACACCCAG CTAGCAGTGC TACCCCCATT TTTTAGCCGA AAGGATTCCA CCATTGTGCT 120

CGAATCCAAC GGATTTGACC CCGTGTTCCC CATGGTCGTG CCGCAGCAAC TGGGGCACGC 180

TATTCTGCAG CAGCTGTTGG TGTACCACAT CTACTCCAAA ATATCGGCCG GGGCCCCGGA 240

TGATGTAAAT ATGGCGGAAC TTGATCTATA TACCACCAAT GTGTCATTTA TGGGGCGCAC 300

ATATCGTCTG GACGTAGACA ACACGGATCC 330



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 627 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
GGATCCGCTG GCAGGTGGGC GCGCACCTCG TCGGGTAGCT TGGAGACAAA CAGCTCCAGG 60
CCAGTCCGCG CCGTAGCGCC TGCAGGTGCC TCACCACCGG GGCCGGGTCA TGCGATCTGT 120
TTAGTCCGGA GAAGATAGGG CCCTTGGGAA GCCGCTGAAC CAGCTCCAGG GTCTCCAAGA 180
TGCGCACCGG TTGTCGGAGC TGTCGCGATA GAGGTTAGGG TAGGTGTCCG GTCCGTCCGT 240
GGGCTCAAAC CTGCCCAGAC ACACCACTGT CTGCTGGGGG ATCATCCTTC TCAGGGAGAT 300
GCATTCTTTG GAAGTAGTGG TAGAGATGGA GCAGACTGCC AGGGCGTTGC AGGAGTGGTG 360
GCGATGGTGC GCACCGTTTT TAAGAAACCC CCCAGGGTGG GGACTCCCGC TCCCTGCAGC 420
ATCTCGGCCT GCTGTACGTC CTTGGCGAAT ATGCGACGAA ATCGGCTGTG CGCACGGGGT 480
CCCAGGGCCG GTCCGGTGGC ATACAGGCCG GTGAGGGCCC CCTGGGTCTG TCCGCCTGGA 540
AACAGGGTGC TGTGAAACAA CAGGTTGCAA GGCCGCGAAT ACCCCTCTGC ACGCTGCTGT 600
GGACGTGGGT GTATGCTCCG TGGATCC 627

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 233 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: N

(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

AGCCGAAAGG ATTCCACCAT TGTGCTCGAA TCCAACGGAT TTGACCCCGT GTTCCCCATG 60

GTCGTGCCGC AGCAACTGGG GCACGCTATT CTGCAGCAGC TGTTGGTGTA CCACATCTAC 120

TCCAAAATAT CGGCCGGGGC CCCGGATGAT GTAAATATGG CGGAACTTGA TCTATATACC 180

ACCAATGTGT CATTTATGGG GCGCACATAT CGTCTGGACG TAGACAACAC GGA 233

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 328 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: N

(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

GAAATTACCC ACGAGATCGC TTCCCTGCAC ACCGCACTTG GCTACTCATC AGTCATCGCC 60

CCGGCCCACG TGGCCGCCAT AACTACAGAC ATGGGAGTAC ATTGTCAGGA CCTCTTTATG 120

ATTTTCCCAG GGGACGCGTA TCAGGACCGC CAGCTGCATG ACTATATCAA AATGAAAGCG 180

GGCGTGCAAA CCGGCTCACC GGGAAACAGA ATGGATCACG TGGGATACAC TGCTGGGGTT 240

CCTCGCTGCG AGAACCTGCC CGGTTTGAGT CATGGTCAGC TGGCAACCTG CGAGATAATT 300

CCCACGCCGG TCACATCTGA CGTTGCCT 328

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 132 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
AACACGTCAT GTGCAGGAGT GACATTGTGC CGCGGAGAAA CTCAGACCGC ATCCCGTAAC 60
CACACTGAGT GGGAAAATCT GCTGGCTATG TTTTCTGTGA TTATCTATGC CTTAGATCAC 120
AACTGTCACC CG 132 
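
The declared LENGTH of an entry can be cross-checked against its sequence rows. The following Python sketch is illustrative only and is not part of the listing; the function name `parse_listing_rows` and the regular expression `TRAILING_COUNT` are assumptions introduced here. It strips the running position count from the end of each row, joins the ten-base groups, and rejects any character that is not A, C, G or T.

```python
import re

# Running position count at the end of a row, e.g. " 60" or an
# extraction-split " 12 0" (hypothetical pattern chosen for this sketch).
TRAILING_COUNT = re.compile(r"\s+\d[\d\s]*$")


def parse_listing_rows(rows):
    """Join sequence-listing rows into one uppercase nucleotide string."""
    parts = []
    for row in rows:
        # Drop the trailing position count, then the spaces between groups.
        body = TRAILING_COUNT.sub("", row.rstrip())
        parts.append(re.sub(r"\s+", "", body).upper())
    sequence = "".join(parts)
    # Anything other than A, C, G, T points to a transcription error.
    unexpected = sorted(set(sequence) - set("ACGT"))
    if unexpected:
        raise ValueError(f"non-nucleotide characters found: {unexpected}")
    return sequence


# Rows of SEQ ID NO: 40 as given in the listing.
SEQ_ID_NO_40_ROWS = [
    "AACACGTCAT GTGCAGGAGT GACATTGTGC CGCGGAGAAA CTCAGACCGC ATCCCGTAAC 60",
    "CACACTGAGT GGGAAAATCT GCTGGCTATG TTTTCTGTGA TTATCTATGC CTTAGATCAC 120",
    "AACTGTCACC CG 132",
]

seq_40 = parse_listing_rows(SEQ_ID_NO_40_ROWS)
print(len(seq_40))
```

For SEQ ID NO: 40 the joined sequence has 132 bases, which agrees with the declared length. Only the count at the end of a row is removed, so a digit inside a base group raises an error instead of being dropped silently.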

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single




(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: N 
(iv) ANTI-SENSE: N 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
AGCCGAAAGG ATTCCACCAT TCCGTGTTGT CTACGTCCAG 40

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: N 
(iv) ANTI-SENSE: N 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
GAAATTACCC ACGAGATCGC AGGCAACGTC AGATGTGA 38 
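
Read against SEQ ID NO: 39, the 38 bases of SEQ ID NO: 42 appear to consist of the first 20 bases of SEQ ID NO: 39 followed by the reverse complement of its last 18 bases. This is an observation drawn from the listed sequences, not a statement made in the listing itself. The Python sketch below checks it; the function `reverse_complement` and the constant names are introduced here for illustration only.

```python
# Watson-Crick complement table for uppercase DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of an uppercase DNA string."""
    return sequence.translate(COMPLEMENT)[::-1]


# Values copied from the listing with the group spacing removed.
SEQ_39_FIRST_20 = "GAAATTACCCACGAGATCGC"  # 5' end of SEQ ID NO: 39
SEQ_39_LAST_18 = "TCACATCTGACGTTGCCT"     # 3' end of SEQ ID NO: 39
SEQ_ID_NO_42 = "GAAATTACCCACGAGATCGCAGGCAACGTCAGATGTGA"

# Rebuild SEQ ID NO: 42 from the two ends of SEQ ID NO: 39 and compare.
rebuilt = SEQ_39_FIRST_20 + reverse_complement(SEQ_39_LAST_18)
print(rebuilt == SEQ_ID_NO_42)
```

The comparison holds for the sequences as listed. It says nothing about how the sequences were used, which is described elsewhere in the specification.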

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
AACACGTCAT GTGCAGGAGT GACCGGGTGA CAGTTGTGAT CTAAGG 46



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: N 

(iv) ANTI-SENSE: N 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
ACAGGGCTGG TTGCCCAGGG T 21



(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: N

(iv) ANTI-SENSE: N

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

AGTTGCAAAC CAGACCTCAG 20
What is claimed is:

1. An isolated DNA molecule which is at least 30
nucleotides in length and uniquely defines a
herpesvirus associated with Kaposi's sarcoma.

2. The isolated DNA molecule of claim 1, wherein the
isolated DNA molecule is cDNA.

3. The isolated DNA molecule of claim 1, wherein the
isolated DNA molecule is genomic DNA.

4. An isolated RNA molecule which is derived from
the isolated nucleic acid molecule of claim 1.

5. The isolated DNA molecule of claim 1 which is
labelled with a detectable marker.

6. The isolated DNA molecule of claim 5, wherein the
marker is a radioactive label, or a colorimetric,
a luminescent, or a fluorescent marker.

7. A replicable vector comprising the isolated DNA
molecule of claim 1.

8. A plasmid, cosmid, λ phage or YAC containing at
least a portion of the isolated DNA molecule of
claim 1.

9. A host cell containing the vector of claim 7.

10. The cell of claim 9 which is a eukaryotic cell.

11. The cell of claim 9 which is a bacterial cell.

12. An isolated herpesvirus associated with Kaposi's
sarcoma.
13. A nucleic acid molecule of at least 14
nucleotides capable of specifically hybridizing
with the isolated DNA molecule of claim 1.

14. A DNA molecule of claim 13.

15. A nucleic acid molecule of at least 14
nucleotides capable of specifically hybridizing
with a nucleic acid molecule which is
complementary to the isolated DNA molecule of
claim 1.

16. A nucleic acid molecule of claim 15 wherein the
nucleic acid molecule is capable of hybridizing
with moderate stringency to at least a portion of
a nucleotide sequence as shown in Figure 3A (SEQ
ID NO: 1).

17. An isolated peptide encoded by at least a portion
of a nucleic acid molecule with a sequence as set
forth in (SEQ ID NOs: 1-37).

18. A host cell which expresses the peptide of claim
17.

19. The isolated peptide of claim 17, wherein the
peptide is linked to a second peptide to form a
fusion protein.

20. The fusion protein of claim 17, wherein the
second peptide is beta-galactosidase.

21. An antibody which specifically binds to the
peptide encoded by the isolated DNA molecule of
claim 17.
22. The antibody of claim 21, wherein the antibody is
a monoclonal antibody.

23. The antibody of claim 21, wherein the antibody is
a polyclonal antibody.

24. The antibody of claim 21, wherein the antibody is
labelled with a detectable marker.

25. The labelled antibody of claim 24, wherein the
marker is a radioactive label, or a colorimetric,
a luminescent, or a fluorescent marker.

26. An antisense molecule capable of hybridizing to
the isolated DNA molecule of claim 1.

27. The antisense molecule of claim 26, wherein the
molecule is a DNA.

28. The antisense molecule of claim 26, wherein the
molecule is an RNA.

29. A triplex oligonucleotide capable of hybridizing
with a double stranded isolated DNA molecule of
claim 1.

30. A transgenic nonhuman mammal which comprises at
least a portion of the isolated DNA molecule of
claim 1 introduced into the mammal at an
embryonic stage.

31. A vaccine which comprises an effective immunizing
amount of the isolated herpesvirus of claim 12
and a suitable pharmaceutical carrier.
32. A method of diagnosing Kaposi's sarcoma which
comprises: (a) obtaining a nucleic acid molecule
from a tumor lesion of the subject; (b)
contacting the nucleic acid molecule with the
labelled nucleic acid molecule of claim 13 under
hybridizing conditions; and (c) determining the
presence of the nucleic acid molecule hybridized,
the presence of which is indicative of Kaposi's
sarcoma in the subject, thereby diagnosing
Kaposi's sarcoma.

33. The method of claim 32 wherein the DNA molecule
from the tumor lesion is amplified before step
(b).

34. A method of diagnosing Kaposi's sarcoma which
comprises: (a) obtaining a nucleic acid molecule
from a suitable bodily fluid of a subject; (b)
contacting the nucleic acid molecule with the
labelled nucleic acid molecule of claim 13 under
hybridizing conditions; and (c) determining the
presence of the nucleic acid molecule
hybridized, the presence of which is indicative
of Kaposi's sarcoma in the subject, thereby
diagnosing Kaposi's sarcoma.

35. A method of diagnosing a DNA virus associated
with Kaposi's sarcoma which comprises (a)
obtaining a suitable bodily fluid sample from a
subject, (b) contacting the suitable bodily fluid
of the subject to a support having already bound
thereto a Kaposi's sarcoma antibody, so as to
bind Kaposi's sarcoma antibody to a specific
Kaposi's sarcoma antigen, (c) removing unbound
bodily fluid from the support, and (d)
determining the level of Kaposi's sarcoma
antibody bound by the Kaposi's sarcoma antigen,
thereby diagnosing Kaposi's sarcoma.
36. A method of diagnosing a DNA virus associated
with Kaposi's sarcoma which comprises (a)
obtaining a suitable bodily fluid sample from a
subject, (b) contacting the suitable bodily fluid
of the subject to a support having already bound
thereto a Kaposi's sarcoma antigen, so as to bind
Kaposi's sarcoma antigen to a specific Kaposi's
sarcoma antibody, (c) removing unbound bodily
fluid from the support, and (d) determining the
level of the Kaposi's sarcoma antigen bound by
the Kaposi's sarcoma antibody, thereby diagnosing
Kaposi's sarcoma.

37. A method of treating a subject with Kaposi's
sarcoma, comprising administering to the subject
an effective amount of an antisense molecule of
claim 26 under conditions such that the antisense
molecule selectively enters a tumor cell of the
subject, so as to treat the subject.

38. A method for treating a subject with Kaposi's
sarcoma (KS) comprising administering to the
subject having a human herpesvirus-associated KS
a pharmaceutically effective amount of an
antiviral agent in a pharmaceutically acceptable
carrier, wherein the agent is effective to treat
the subject with KS-associated human herpesvirus
of claim 12.

39. A method of prophylaxis or treatment for Kaposi's
sarcoma (KS) by administering to a subject at
risk for KS, an antibody that binds to the human
herpesvirus of claim 12 in a pharmaceutically
acceptable carrier.

40. A method of vaccinating a subject against
Kaposi's sarcoma, comprising administering to the
subject an effective amount of the peptide of
claim 17, and a suitable acceptable carrier,
thereby vaccinating the subject.

41. A method of immunizing a subject against a
disease caused by the herpesvirus associated with
Kaposi's sarcoma which comprises administering to
the subject an effective immunizing dose of the
vaccine of claim 12.

42. A method for preventing the development or
transmission of herpesvirus associated Kaposi's
sarcoma in a subject by treating a subject with
Kaposi's sarcoma (KS) comprising administering to
the subject having a human herpesvirus-associated
KS a pharmaceutically effective amount of an
antiviral agent in a pharmaceutically acceptable
carrier, wherein the agent is effective to
prevent the development or transmission of the
KS-associated human herpesvirus of claim 12.
FIGURE 1
FIGURE 3A-1



SEQ. ID. NO. 1 

7CGAG7CGGA GAG77GGCAG AGG C 77TG AG CTCGCTG7GA CGTTrTCACG G7G77GG77G 6 0 

GGA7CAGC7G G7GAC7CAGA CAAG7777GA G77C7ACAAC G7AACA7ACG GGCTGATGCr 12 0 

CAGCCGA7AC CAGAA77ACG CAG7GGGCAA TT^TGTG"CC 7AGAG7CACC 7CAAAGAA7A 180 

A7C7G7GG7G TCCAAGGGGA GGGTTCTGGG GCCGGC7ACT 7AGAAACCGC CATAGATCGG 24 0 

GCAGGGTGGA GTACTTGAGG AGCCGGCGGT AGGTGGCCAG GTGGGCCCGG 77AGG7GC7C 3 00 

TTTTGCGTGC TGCTGGAAGC C7GG7CAGGG A777CTTAAC CTCGGC77C3 G77GGACG7A 3 60 

CCATGGCAGA AGGCGG7777 GGAGCGGACT CGGTGGGGCG CGGCGGAGAA AAGGCC7C7G 4 20 

TGACTAGGGG AGGCAGG7GG GACTTGGGGA GC7CGGACGA CGAA7CAAGC ACrTCCACAA 4 80 

CGAGCACGGA 7A7GGACGAC C7CC77GAGG AGAGGAAAC Z A77AAGGGGA AAG777G7AA 54 C 

AAACG7CG7A CA7A7ACGAC G7GCCCACCG 7CCCGACCAG CAAG3CG7GG CA777AA7GC 6 00 

ACGACAA77C CC777ACGCA ACGCC7AGG7 77CCGCCCAG ACC7C7rA7A CGGCACCC7T 6 60 

CCGAAAAAGG CAGCA777T7 GCCAG7CGG7 7G7CAGCGAC 7GA7GACGAC 7 "33 A3 ACT 72 0 

ACGCGCCAA7 GGA7CGCT7C GC777CCAGA GC^7CAGGG7 G7G7GG7CGC ZZTZCZTTTC 78 0 

CGCC7CCAAA 7CACCCACC7 CCGGCAAC7A GGCCGGCAGA CGCG7CAA7G GGGGAC37GG 64 0 

G C7GGGCGGA 777GCAGGGA C7CAAGAGGA CC~CAAAGGG A77TT7AAAA AZA777ACCA 900 

AGGGGGGCAG 7C7CAAAGCC CG7GGACGCG A7G7AGG7GA CCG777CAGG GACGGG3GC7 96 0 

7TGCC777AG 7CC7AGGGGC G7GAAA7C73 CCA7AGGGCA AAA £A 77 AAA 7CA7GG77GG 1020 

GGA7CGGAGA A7CA7CGGGG AC7G77G7CC CCG7CACCA" GCA3777A7G G7AZCG37GC 109 0 

ACC7CA77AG AACGG77G7G ACC37GGAC7 ACAGGAA7G7 77A7773777 TA "TAG AGG 114 5 

GGG7AA7GGG 7G7GGGCAAA 7CAACGC7GG 7CAACGCCG7 G73CGGGA7C 77GCCCCAGG 12 0C 
FIGURE 3A-2

AGAGAG7GAC AAGTTTTCCC GAGCCCA7GG 7G7AC7GGAC GAGGGCA777 ACAGA77G77 12 eC 

ACAAGGAAAT TTCCCACCTG A7GAAG777G G7AAGGCGGG AGACCCGCTG ACG7CTGCCA 1 3 2 2- 

AAATATACTC A7GCCAAAAC AAG7TTTCGC TCCCCTTCCG GACGAACGCC ACCG77A7C7 13 5 2 

7GGGAA7GA7 GCAGCCCTGG AACG7TGGGG G7GGG7CTGG GAGGGG CAC7 CAC7GG7GCG 14 4 2 

7C777GA7AG GCA7C7CC7C TCCCCAGCAG 7GG7G7TCCC 7C77A7GCAC C7GAAGCACG 15 0 0 

GCCGCC7A7C 7777GA77AC T7C77TCAA7 7ACTTTCZA7 C777AGAGCC ACAGAAGGCG 156 0 

ACG7GG7CGC CA7777CACC C7C7CCAGCG CCGAG7CG77 GCGGCGGG7C AGGGCGAGGG 16 2 0 

GAAGAAAGAA CGACGGGACG G7GGAGCAAA A77ACA7CAG AGAA77GG7G 7GGG777A7C 16 8 0 

ACGC7G7G7A C7G7TCA7GG A7CA7G77GC AG7ACA7CAC 7G7GGAGCAG A7GG7ACAAC 1*740 

7A7GC37ACA AACCACAAA7 A777CGGAAA 7C7GC77CCG CAGCG7GCGC C73GCACACA 180 0 

AGGAGGAAAC 7T7GAAAAAC C77CACGAGC AGAGCA7GC7 A7C7A7GA7C ACCGG7G7AC I86 0 

7GGA7CC7G7 GAGACA7CA7 CC7G7CG7GA 77GAGC777G C7777G777r 77CACAGA3C 192 0 

7GAGAAAA77 ACAA777A7C G7AGCCGACG C GGA7AAG 77 C7ACGA7GA7 G7A7GCGGC7 198 0 

7G7GGA7CGA AA7C7ACAGG CAGA7CC7G7 CCAA7CCGGC 7A77AAACr7 AGGGCCA7CA 2 04 0 

AC7GGCCAGC A77 AG AG AG C CA3777AAAG CAG77AA77A 377AGAGGA3 ACA7GCAGGG 210 : 

7C7AGCC77C 77GGCGGCCC 77G2A7GC7G GCGA7GCA7A 7C377GAZA7 G7GGAGCCAC 2 26 0 

7GGCGCG77G CCGACAACGG CGACGACAA7 AACC7GC7rr G2CACGCA37 77A7CAA7G3 2 22 0 

GAGAACCAAC C7C73CA7AG AAC7GGAA77 CAACGGCA77 AG - .7 7AAA77GGCA 223 0 

AAA777377G AA7G7GA7CA CGGAGCCGGC 7773ACAGAG T7G7GGAC77 C2GCC3AAG7 2 34 0 

CGCCGAGGAC C7CAGGG7AA C7C7GAAAAA GAGGCAAAG7 777777773^ CGAACAAGAG 24 00 
FIGURE 3A-3

AGTTGTGA7C 7CTGGAGACG GCCA7CGC7A TACG7GCGAG G7GCCGACG7 CGTC3CAAAC 246: 
TTATAACATC ACCAAGGGCT 77AACTATAG CGCTCTGCCC GGGCACCT7G GCGGATTTGG 2 52: 

GA7CAACGCG CGTCTGG7AC TGGG7GATA7 CTTCGCATCA AAA7GG7CGC 7ATTCGCGAG 2 58C 

GGACACCCCA GAGTATCGGG TGTTTTACCC AATGAATGTC ATGGCC3TCA AG 7777 C GAT 264 0 
A7CCA7TGGC AACAACGAGT CCGGCGTAGC GCTCTATGGA G7GGTGTCGG AAGA777CG7 -~00 
GG7CG7CACG CTCCACAACA GG7CCAAAGA GGCTAACGAG AC3GCGTCCC ATC7TCTG7T 2 76 0 
CGG7C7CCCG GA7TCACTGC CAT CT CTG AA GGGCCATGTC ACC7A7GA7G AACTCACGTT 2S2C 

CGCCCGAAAC GCAAAA7ATG CGC7AGTGGC GATCCTGCCT AAAGATTCTT ACCAGACACT 2 8 80 

CC77ACAGAG AAT7ACACTC G CAT ATTTC7 GAACATGACG GAGTCGACGC CCCTCGAGTT 2 94 0 

CACGCGGACG ATCCAGACCA GGATCGTATC AATCGAGGCC AGGCGCGCC7 GCGCAG C7CA 3 0 DC 

AGAGGCGGCG CCGGACATAT TCTTGGTGTT GTTTCAGATG 7TGG7GGCAC AC777C77GT 3 06 0 

7GCGCGGGGC ATTGCCGAGC ACCGA7TTG7 GGAGG7GGAC TGCG7GTGTC GGCAGTATGC 312 0 

GGAACTG7A7 TTTCTCCGCC GCA7C7CGCG 7CTGTGCATG CCCACGTTCA CCAC7G7CGG 318 0 

G 7 AT AA C CAC ACCACCCTTG GCGC7GTGGC CGCCACACAA A7AGC7CGCG TGTCCGCCAC 324 0 

GAAGT7GGCC AGT7TGCCCC GCTC7TCCCA GGAAACAGTG C7GGCCATGG TCCAGCTTGG 3 300 

CGCCCGTGAT GGCGCCGTCC CT7CCTCCA7 7 C7GG AGGG C A77G C7A7GG TCG7CGAACA 3 36 0 

7A7G7ATACC GCCTACAC77 ATGTG7ACAC ACTCGGCGAT AC7GAAAGAA AA77AA7GT7 342 0 

GGACA7ACAC ACGGTCCTCA CCGACAGCTG CCCGCCCAAA GACTCCGGAG 7A7CAGAAAA 3480 

GC7ACTGAGA AGATATTTGA 7G7TCACA7C AA7G7G7ACC AACA7AG AG C 7GGGCGAAA7 3 54 0 

GATCGCCCGC TTTT CC AAA C CGGACAG CCT TAACATCTAT AGGG C ATT 27 CCCCCTGC7T 36 00 

TCTAGGACTA AGGTACGATT TGCATCCAGC CAAGTTG CGC GCCGAGGCGC CGCA3TCGT2 3 66 0 

CGCTCTGACG CGGACTGCCG TTGCCAGAGG AACA7CGGGA 77 Z G Z AGAA7 7GC7CCACGC 372 0 

GC7GCACCTC G A7 AG C77AA A777AA77CC GGCGA77AAC 7G77CAAAGA 77ACAGCCGA 3 78 0 

CAAGATAATA GCTACGGTAC CC77GCCTCA CG7CACG7A7 A7CA7CAG77 CCGAAGCAC7 3 84 0 

C7CGAACGC7 G77G7C7ACG AGG7G7CGGA GA7C772C7C AAGAG7GCCA 7G777A7A7C 3 900 

TGCTATCAAA CCCGATTGCT CCGGCTTTAA CTTTTC7CAG A77GATAGG2 ACATTCCCAT 3 96 0 

AGTCTACAAC AT C AG CACA C CAAGAAGAGG TTGCCCCCTT TGTGArTCTG 7AATCATGAG 4C2 0 

CTACGATGAG AGCGATGG CC TGCAGTCTCT CATGTATGTC ACTAATGAAA GGGTG2AGAC 4 08 0 

CAACCTCTTT TTAGATAAGT CA2CT77CTT 7GA7AA7AAC AACC7ACACA 772ATTAT77 4 14 C 

G7GGC7GAGG GACAACGGGA CCG7AG7GGA GATAAGGGG2 A7G7ATAGAA GACGCGCAGC 4 ICC 

CAGTGCT77G T77CTAA77C T CT CTTTT AT TGGGTTCTCG GGGGTTATCT A A 426 C 

CAGACTGTTT TCCATCCTTT ATTAGACGGT CAATAAAGCG 7AGA77777A AAAGG77TC2 43 2 0 

7G7GCATTCT TTT7G 7ATGG GCA7A7AC77 GGGAAGAAAT CCGAGCACC7 CAGAAAG7GG 4 3 80 

ATTGCCGTCA CATATCAGTT CGACCACCCC TGCACCTAGC CATGCGGCGC TTTGACGGTC 4 4 4 C 

777GGGG C7 A CACA7CA7AA AGTACTTTTC CATGGCTTCT ATAAGCACCT TGGAACAATC 4S0C 
FIGURE 3A-4

T3GGGG7TGG CGAA7GGG77 CCCTAAACGG GAAA7CC737 A7GG7A7TCA GGZAGAAGAC 4 56 C 

CGCG7CC7CC ACCCGACG77 7GAG7C77TC TAG CAG AG C G C7GAAGAAC7 CCCGC7C3TG 462: 

TGTTTTCGCA GGGGCAAGTT C7GCGCCG7A CAGCGA7GAG AAACACGAGA CGA7G777TC 46 80 

CAGCCCCA7G CTGCGCAGCA ACACG7GCT7 CAGGAACAGG TG77G7AGCC GG777AGTT7 4 74 0 

7AGC7TGGG7 AGAAAAGTTA TCGAGTTGTT AGCACGCTCC A7GA7GG7AA CGGTG7TGAA 4 8 00 

G7CACAGACC GGGCTT7CTC CGAG7C7GGG CCGCCTGAG7 CGAA7CA7G7 AGAACA7AGA 4 86C 

CGCGGCC7CG 77G7C7G7G7 7AAGTGACAC GA7A7CCCG7 77GCAAAC77 G7GGGA7G77 4S2C 

G7G777CAG7 A7AGA7C7GG 7CTGACC3GC ACGGGG7G77 A7GGGG7GAC GCGG7AAAGG 4 98 0 

CGAC737GGG .TGAAAGAC3T 77A7GCGG77 GGCGGCCTCG 7GGA7GACGA CACGC77G77 5 04 0 

CGCGGCG7G7 A7GGGGACGC GACGGCA7CC CGCTGGCAGA TGTA7AA7C7 7AAAG77GGT 513 0 

A7AAGAC7GG 7CGC7CG77A TGGCCAGCCG GCAC7CCGG7 AG7A7C7GCG 7G7777CGAA 516 0 

7TCGTGGCCG CG7ACGAC7G G C7TGGAG7G CAGG7AAACG CCAAGAGA7G CGG777C77C 52 2 0 

GCG7ACGCAC AAG7GGC7TC 7TAACGCG7A GGGGTGCGG7 GAGAG CATGA 7CC37AGCAA 52 8 0 

CGATAG77CC GGG7GCC7AG CCGCG7AGAG 7GGCAGGG7A GACGAG7CCG GA37CCGAAA 5 34 0 

377T7CGAAC AACAG7GGCA 7CGGGAC777 AG GA 77 AG AG A77C37ACCA 7GGCCGZCAC 54 0 0 

CGCCGGAGAG G7CAAGACG7 GAAACACGCG CTCGCC7G7C GACAGGCGCG ZZZZZZZZZZ 54 6 0 

TAG7AGAC7A GC777CACG7 C7GGAAC7CG 7AACA7AGC7 7AGAC2AGCG GACGGACGCA 5520 

ACG7ACGCGG GGA7CGGCTG GCGG7G7C73 C7CG77GGAC G~GG"G77C 3G7GGCGCCA 5 5 30 

G7GCAGGCG7 AG777GCGAA 7GG3G73ACG GAGAA77TG7 GG 7777 AG AG CGGGGAACCG 564 0 

A7GACCCG7G G7GGCGACGA ACGAAA7GAA G777GCA77G CGGCCGAAC7 CG777AGCC7 5 7 00 

GG7C77CT7G 7777GGGCA7 A3A7T7TC3G GA77AGG77A GA777777A7 A7CC7A37AC 5 76 0 

7GCGCAC7rG 7G777GG777 7AG7373AT7 GA77A7777r 7773AGAA3T CAAACAGGTT 582 0 

GCGGGCGGCG G77CGCC7AA 73CAAGCGA:: G7GAA3C77G AGAAACGAAC AGCA773CAC 5 8 60 

CAGACA37C3 AGGAACC7TT 7G7G7AGCG7 77G7A77TGG GAAZGG777C 73T3rTCAAG 5 94 0 

7AGGGAGAA7 ATT 37 A. 7G777CCG7C GA7GCGCGCG 7GC7GG7CCG 7GAGAA7GGG 6 0 00 

CGCCAGCTCG 7GG3GAAT77 G7TGCACAAG AGG77GGCC3 7A GA 37T7AG AAA7C373G3 6O6 0 

7G7CGCGGCC 77AAAC CAGG ACAGG7T7AG CCCA7C377G C7GGAGAC3A CAGA73CAAA 6 12 0 

G777G7GG7C CAAAA7ACG7 7777TGGCC- GA7TC7CACC ATG7AT7G37 T77CCAG7rr 618 0 

G7GCAGG7CC AACG7GGAG7 733AA777GC 7A7CGA7ACA GGAAA7A7G7 Grr7GA77GG 6 24 0 

CAGAAAGGA7 77CAGCG7AC CGA77333AA G A3AAAG7G Z AG2A7G7CCC CAT73A7G77 6 30 0 

GA7G7T7A77 GGGSTGCTTT GACACA7G77 G7CGGAAAAA AACACG777A 7GG7AAAAGA 6 36C 

AGG7TCC7T7 AC3GAGTAC7 77GG7A7AAC AAAA77G77G G7GAATCTGG GGATGTT7AA 6420 

AA7AG77777 73 3AGGG7G7 7AGGAACG7G GCA3CT7A7C 77AG7377AA 73ArGAT377 £4 8 0 

GG7G77GAA7 A73G7GA777 7GAA377773 GAAA77GACG 7G777737GG G7702AGCA7 £54 0 

G7C7GACACT G7AGA3C7GC CCAGAG7GCG CG3GTGC37G ZCZZZZZkTC GTTGGAA3GA 6600 
FIGURE 3A-5

CGCC7GCAAA TTTC J L 1 ICA TGGCTGCTCG GGGGTCTTTC GGCGCG7ACC GGA7TGTTGA iS€Z 

AAGCGTCGCC GCCAGGAGAC GGGGTGTGTC GTGGGTGGGT AAAAAGTTTG CGCAGGGG7G e 7 2 2 

CAGTCC3CTG CACGAGTGGC CGATGCAGTC TGCCACTGCC ATA GA CATGA CGAGTGTGTA 6 78C 

GATGGCCGG7 GTGCCCGGAT ACACTAGATA GTAGG7ACAA' 7CTGGGG7AC 7GACGAGCAC 6 64 C 

CCTGTATGGC TTTGGTCCGG GG7CCTTGCG TTGGATTTTT ACGTGCAGAG GGGACAGGAG 6 9CC 

CTGG 7TTAG A GCCAGCTGAA AGCCZACCAG A7CGGG7CG3 7TAAGGTTGA CGTCC7GG7G €96 0 

CTTACTCTGT 7TCGACAGGT TCTTCAGCAC GGTGGGCAGT CGGTCTACG7 7G7GAGGGA7 7C2C 

GGCACGGCGC AG C GAGAC C A GCTCTCCG7G CCACCCCCAC G7GGCCA7GA AGC7GG7GA7 70S: 

GTTAAAGTTT AAAAAATG7A GCTG7GCG7C TGGGGATGCG GGTGGCA7TA 7TGAAAACGA 7i 4 0 

GAGATGCTTG AGGCTC7GGA GGAG7GCAAA A7AA77TTGA TAGA77G7GG G7TGTAGAC7 ~ZZC 

ATGGGGCAAC ACCGCCAGAA ACGCATGAAA ACAC7G7TCG AAC7CCCAGA ACTCGAGGTA "2 6 0 

CCTGCACAC7 A7CC7GAACA TGGCTTTGTA A CATATGGTG CACGTTAGTA GCGCGGGAAG 73 2 0 

AT A GAG CGAG CGTAGCTGGC TGAATTCGCA GGGTT7A7CA CAA7CA7CGG 7AAG7TCCGA 73 SO 

TGATCCCACG GCAGGTAGGT AG77G7CGG7 G7CTATCTGT CCGCGCG7AA ACACTCCACG 74 4 0 

ACCGTCAATT ATTAAACCTT CGCCGCTGTA CGGTCGACCC AGTTTTGGGA AAAGAGTCGC 7 500 

7TCTTGA7GT ATAAAAGGGT GGAGGCGTTC CGGGAGGAGT AGTGTGGGTA TCGC7C7GCA 7 56 0 

GGCGAAAAAG G7GGGCTCGG GCTGCA7CA7 C77A7CAAGA CCTTCTAAGG TGAGGTGTGC 762 0 

CTGCAGGTGG GAGTTGG7GG C GAGA C AG C A G AATATTTC C AGCTGTGATT CGGAAGTCGC 76 BO 

TTGATAACAC GTGGTGTGCG GAG7CGTCG7 CAGGGAGGCG CT-GGTGGCA GTAGTAGGGG 774 0 

GCCCTCGAGC GCTGCGATGG AGGCGACGTT GGAGCAACGA CrTT7GCCG7 AGGTCGCCAG 7 8 0 C 

GGAGGCGAAC CTCCTAACGC AG A77AAGGA GTCGGGTGCC GACGGAGTC7 TCAAGAGGT7 76 6 0 

TCAGCTA77G C7CGGCAAGG ACGCCAGAGA AGGCAGTGTC CGTTTGGAAG CGCTAGTGGG 792 0 

GGTA7A7ACC AATGTGGTGG AGTTTGTTAA G7TTGTGGAG AGGGGGGTGG CCGCGGCTTG 7 9B 0 

CGTCAATACC GAGTTGAAGG ACC7GCGGAG AATGATAGAT GGAAAAATAC AGTTTAAAAT 8 04 0 

TTCAA7GCCC ACTATTGCCG A G GG AGACGG GAGGAGGCGC AACAAGCAGA GAGAGTATAT 8100 

CGTCATGAAG GCTTGCAATA AGCACCACAT CGGTG CGGAG ATTGAGGTTG CGGCGSCAGA 816C 

CATCGAGCTT CTCTTCGGGG AGAAAGAGAC GCCGTTGGAC 7TGAGAGAG7 AGGGGGGTGC 6 22 0 

CATCAAGACG A77ACG7CGG CTTTGCAGTT TGGTATGGAC GCCG7AGAAG GGGGGGTAG7 32 8 0 

GGACACGGTT CTCGCAGTTA AAGTTCGGGA CGGTCCAGGC GTCTTTATTT TAAAGACGGT 8 34 0 

GGGCGATCCC GTCTACTCTG AGAGGGGCGT CAAAAAGGGC GTCAAGTGTG ACATGGTATC 84 0 0 

CATG77 GAAG GCACACCTCA 7AGAACA77C A777TTTGTA GATAAGGGGG AGCTGA7GAC 84 6 0 

AAGGGGGAAG CAGTA7GTGC TAACGATGCT CTCCGAGATG CTGGGCGGGG TGTG2GAGGA 852: 

TACCGTC7TT AAGOGTGTCA GCACGTACAC CACGGCGTCT GGG2AGCAGG TGGGGGGCGT 8 SBC 

CCTGGAGACG ACGGAGAGCG TCA7GAGACG GCTGATGAAC GTGGTGGGGC AAGTGGAAA3 864 C 

7GCCA7GTGG GGGGGGGCGG CCTACGCCAG CTAGGTTGTC AGGGGTGGGA AGGTOGTCAG 87CI 
FIGURE 3A-6

CGCCGTTAGC TACGGAAGGG CG ATG AG AAA CTTTGAACAG TTTATGGCAC GCATAGTGGA 6*60 

CCATCCCAAC GCTCTGCCGT CTGTGGAAGG TGACAAGGCC GCTCTGGCGG ACGGACACGA 8 82 0 

CGAGATTCAG AGAACCCGCA TCGCCGCCTC TCTCGTCAAG ATAGGGGATA AGTTT3TGGC 5 38 : 

CATTGAAAGT T7GCAGCGCA TGTACAACGA GACTCAGTTT CCCTGCCCAC TGAACCGGCG 9 94C 

CATCCAGTAC ACCTATTTCT TCCCTG7TGG CCTTCACCTT CCCGTGCCCC GCTACTCGAC 9300 

ATCCGTCTCA GTCAGGGGCG TAGAATCCCC GGCCATCCAG TCGACCGAGA CGTGGGTGGT 906 C 

TAATAAAAAC AACGTGCCTC TTTGCTTCGG TTACCAAAAC GCCCTCAAAA GCATATGCCA 912 C 

CCCTCGAATG CACAACCCCA CCCAGTCAGC CCAGGCACTA AACCAAGCTT TTCCCGATCC 3 ISC 

CGACGGGGGA CATGGGTACG GTCTCAGGTA TGAGCAGACG CCAAACATGA A C CT ATT CAG 924 0 

AACGTTCCAC CAG T ATT ACA TGGGGAAAAA CGTGGCATTT GTTCCCGATG TGGCCCAAAA 93 0 0 

AGCGCTCGTA ACCACGGAGG ATCTACTGCA CCCAACCTCT CACCGTCTCC TCAGATTGGA 93 6 0 

GGTCCACCCC TTCTTTGATT TTTTTGTGCA CCCCTGTCCT GGAGCGAGAG GATCGTACCG 94 2 0 

CGCCACCCAC AGAACAATGG TTGGAAATAT ACCACAACCG CTCGCTCCAA GGGAGTTTCA 94 8 0 

GGAAAGTAGA GGGGCGCAGT TCGACGCTGT GACGAATATG ACACACGTCA TAGACCAGCT 954 0 

AAGTATTGAC G7c: ATACAGG AGACGGCATT TGACCCCGCG TATCCCCTGT TCT3CTATGT 9S00 

AATCGAAGCA ATGATTCACG GACAGGAAGA AAAATTCGTG ATGAACATGC CCCTCATTGC 96 6 0 

CCTGGTCATT C AAA C CT ACT GGGTCAACTC GGGAAAACTG GCGTTTGTGA ACAGTTATCA 972 C 

CATGGTTAGA TTCATCTGTA CGCATATTGG GAATGGAAGC ATCCCTAAGG AGGCGCACGG 978 0 

CCACTACCGG AAAATCTTAG GCGAGCTCAT CGCCCTTGAG CAGGCGCTTC TCAAGCTCGC 984 0 

GGGACACGAG ACGGTGGGTC GGACGCCGAT CACACATCTG GTTTCGGC7C TCCTCGACCC 9 9 00 

GCATCTGCTG CCTCCCTTTG CCTACCACGA TGTCTTTACG GATCTTATGC AGAAGTCATC 9 96 0 

CAGACAACCC ATAATCAAGA TCGGGGATCA AAACTACGAC AACCTTCAAA ATAGGGCGAC 1002 0 

ATT CAT CAA C CTCAGGGGTC GCATGGAGGA CCTAGTCAAT AACCTTGTTA ACATTTACCA 10080 

GACAAGGGTC AATGAGGACC ATGACGAGAG ACACGTCCTG GACGTGGCGC CCCTGGACGA 1014 0 

GAATGACTAC AACCCGGTCC TCGAGAAGCT ATT CTACTAT GTTTTAATGC CGGTGTGCAG 12200 

TAACGGCCAC ATGTGCGGTA TGGGGGTCGA CTATCAAAAC GTGGCCCTGA CGCTGACTTA 102 6 0 

CAACGGCCCC GTCTTTGCGG ACGTCGTGAA CGCACAGGAT GATATTCTAC TGCACCTGGA 1C320 

GAACGGAACC TTGAAGGACA TTCTGCAGGC AGGCGACA7A CGCCCGACGG TGGACATGAT 1=3 80 

CAGGGTGCTG TGCACCTCGT TTCTGACGTG CCCTTTCGTC ACCCAGGCCG CTCGCGTGAT 10440 

CACAAAGCGG GACCCGGCCC AGAGTTTTGC CACGCACGAA TACGGGAAGG ATGTGGC3CA 105 00 

GACCGTGCTT GTTAATGGCT TTGGTGCGTT CGCGGTGGCG GACCGCTCTC GC3A3GC3GC i:-56C 

GGAGACTATG TTTTATCC3G TACCCTTTAA CAAGCTCTAC GCTGACCC3T T3GT3GCTGC i:©2C 

CACACTG CAT CCGCTCCTGC CAAACTATGT CACCAGGCTC CCCAACCAGA GAAAC3CGGT 10680 

GGTCTTTAAC GTGCCATCCA ATCTCATGGC AGAATATGAG GAA7GGCACA AGTCGCCCGT 121 *Z 

CGCGGCGTAT GCCGCGTCTT GTCAGGCCAC CCCGGGCGCC ATTAGCGCCA T3GT3A3CA7 10800 
FIGURE 3A-7

G C AC C AAAAA CTA7CTGCCC CCAG777CA7 7TGCCAGGCA AAACA7CG CA 7GCAC~C73G 108 & 2 
7T77GCCA7G ACAGTCGTCA GGACGGACGA GG7TC7AGCA GAGCACA7CC TATArTG™" 1092 0 
CAGGGCGTCG ACA7CCA7G7 77G7GGGC7T GCCT7CGG7G G7ACGG-3CG AGG7ACG7TC 10 980 
GGACGCGGTG ACTTTTGAAA TTACCCACGA GA7CGCTTCC C7GCACACCG CACTTGGCTA 1 1 04 0 
CTCATCAGTC ATCGCCCCGG CCCACGTGGC CGCCA7AACT ACAGACA7GG GAGTACATTG 1110 0 
7CAGGACC7C 77TA7GA7T7 7CCCAGGGGA CGCG7A7CAG GACCGCCAGC 7GCA7GACTA 1116 C 
7A7CAAAA7G AAAGCGGGCG 7GCAAACCGG C7CACCGGGA AACAGAA7GG A7CACG7GGG 1122 C 
ATACAC7GC7 GGGG77CCTC GCTGCGAGAA CC7GCCCGG7 7TGAG7CATG G7CAGC7GGC 112 BE 
AACC7GCGAG A7AA7TCCCA CGCCGG7CAC A7CTGACG77 GCCTA77TCC AGACCCCCAG 1134 0 
CAACCCCCGG GGGCG7GCGG CG7CGGTCG7 G7CG7G7GA7 GC77ACAG7A ACGAAAGCGC 114 00 
AGAGCG777G 77C7ACGACC A7TCAA7ACC AGACCCCGCG 7ACGAA7GCC GG7CCACCAA 1146 0 
CAACCCG7GG GC7TCGCAGC G7GGCTCCC7 CGGCGACG7G C7A7ACAA7A TCACCTT7CG 11520 
CCAGAC7GCG C7GCCGGGCA 7G7ACAG7CC 7TG7CGGCAG 77C7TCZACA AGGAAGACA7 115 80 
7A7GCGG7AC AATAGGGGG7 7G7ACAC777 GG7TAA7GAG 7A77C7GCCA GGC7TGC7GG 1164 0 
GGCCCCCGCC ACCAGCAC7A CAGAC77CCA G7ACG7CG7G G7CAACGG7A CAGACG7G77 11700 
TTTGGACCAG CCTTGCCA7A 7GCTGCAGGA GGCC7A7CCC ACGC7CGCCG CCAGCCACAG 11760 
AG77A7GC77 GCCGAG7ACA 7G7CAAACAA GCAGACACAC GCCZ CAG7AC ACA7GGGCCA 1182 0 
G7A7CTCA77 GAAGAGG7GG CGCCGA7GAA GAGAC7A77A AAGC7CGGAA ACAAGG7GGT 11880 
GTA77AGC7A ACCC77C7AG CG77GGC7AG 7CA7GGCAC7 CGACAAGAG7 A7AG7GG77A 11940 
ACTTCACCTC CAGACTCTTC GC7GA7GAAC 7GGCCGCCCT TCAG7CAAAA A7AGGGAGCG 12 0C0 
7AC7GCCGC7 CGGAGATTGC CACCG777AC AAAA7A7ACA GGCATTGGGC C7GGGG7GCG 12 060 
7A7GC7CACG T GAG AC AT CT CCGGACTACA 7CCAAA77A7 GCAG7A7C7A 7CCAAG7GCA 1212 0 
CAC7CGC737 CCTGGAGGAG GTTC3CCCGG ACAGCCTGCG CC7AACGCGG A7GGATCCC7 1218 0 

CTGACAACCT 7CAGA7AAAA AACG7A7A7G CCCCC77777 7CA37GGGAC AGCAACACCC 122 4 0 

AGC7AGCAG7 GCTACCCCCA 7TTTTTAGCC 3AAAGGA77C CACCA7TG7G C7CGAA7CCA 123 00 

ACGGA7T7GA CCCCG7G77C CCCATGG7CG 7GCCGCAGCA AC7GGGGCAC G77A77C7GC 12360 

AGCAGC7G77 GG7G7ACCAC A7C7A77CCA AAA7A7CGGC CGGGGCCCCG GA73A7G7AA 124 2 0 

A7A7GG CGG A ACTTGATCTA 7ATAC C AC CA ATG7G7CA77 7A7GGGGCGC ACA7A7CG7C 124 BO 

7GGACG7AGA CAACACGGA7 CCACGTAC7G CCCTGCGAGT GCT7GAC3A7 C7G7CCA7G7 12 540 

ACCT77G7A7 C C7ATCAG CC 7TGG77CCCA GGGGG7G7CT CCGTC7GC7C ACGGCGC7CG 126 CC 

7GCGGCACGA CAGGCA7CC7 C7GACAGAGG TG7TTGAGGG GG7GG7GCCA GA7GAGG7GA 1266 0 

CCAGGA7AGA 7C7CGACCAG 77GAGCG7CC CAGA7GACA7 CACCAGGATG CGC3TCA7G7 1272 0 

7C7CC7A777 7CAGAG7C7C AG77CTA7A7 77AA7CTTGG "CCAGA773 CACG7G7A7G 1 27 SO 

CCTACTCGGC AGAGAC777G GCGGCCTCC7 G77GG7A77C CCCACGCTAA C3A777GAAG 1234C 

CGGGGGGGG7 A7GGCG7CA7 C7GA7A77C7 G7CGG77GCA AGGACGGA7G ACGGCTCCG7 12 9:0 
FIGURE 3A-8



CTGTGAAGTC TCCCTGCGTG GAGG7AGGAA AAAAACTACC G7C7ACCTGC CGGACAC7GA 12 96C 

ACCCTGGGTG G7AGAGACCG ACGCCATCAA AGACGCCTTC CTCAGCGACG GGATCGTGGA 13 Ci: 

TATGG CTCGA AAGCTTCATC GTGGTGCCCT GCCCTCAAAT 7C7CACAACG GCTTGAG3AT 13 08C 

GGTGC7TT7T 7GTTA7TG7T ACTTGCAAAA TTGTGTGTAC CTAGCCCTGT 77C7GTGCCC 1314C 

CnTAATCCT 7ACT7GG7AA CTCCC7CAAG CATTGAGTTT GCCGAGCCCG TTG7GGCACC 13 2CC 

7GAGGTGCTC T7CCCACACC CGGCTGAGA7 G7C7CGCGG7 7GCGA7GACG CGATTTTC7G 13 26C 

7AAAC7GCCC TA7ACCG7GC C7A7AA7CAA CACCACG77T GGACGCATTT ACCCGAACTC 13 32C 

7ACACGCGAG CCGGACGGCA GGCCTACGGA 77AC7CCA7G GCCC77AGAA GGGC7TTTGC 13 3 80 

AG7TA7GG77 AACACG7CA7 G7GCAGGAG7 GACAT7G7GC CG CGG AG AAA C7CAGACCGC 13 440 

ATCCCG7AAC CACAC7GAG7 GGGAAAA7C7 GCTGGCTA7G TTTTCTGTGA TTATC7A7GC 13 500 

C77AGA7CAC AACTGTCACC CGGAAGCACT G7C7A7CGCG AGCGGCATCT 77GACGAGCG 13 560 

7GAC7A7GGA 77 ATT CAT C7 C7CAGCCCCG GAGCG7GCCC 7CGCC7ACCC CTTGCGACGT 13.6 2 0 

GTCGTGGGAA GATATCTACA ACGGGAC7TA CC7AGC7CGG CC7GGAAAC7 G7GACCCCTG 1368 0 

GCCCAA7CTA TCCACCCC7C CC7TGA7TC7 AAATTTTAAA TAAAGGTGTG TCACTGGTTA 13 7 4 0 

CACCACGATT AAAAACCACT CAC7GAGA7G 7CTTTTTAAC CGCTAAGGGA TTATACCGGG 13 800 

ATTTAAAACC GCCCACTGAT 77T777ACGC TAAGAGTTGG GTGC77GGGG GG7TT7GCA7. 13 860 

TGCTCTGT7G 7 AAA CT AT AT ATAAGTTAAA CCAAAAT7CG CAGGGAGACA AGGTGACGGT 13 920 

GGTGAGAACT CAGTTGAGAG TCAGAGAATA CAGTG CTAAT CAGGGTAGAT GAGCA7GACT 13 9B0 

TTCCCCGTCT CCAGTCACCG GAGGAATGG7 GGACGGCTCC G7C77GG7GC GAATGGCCAC 14 040 

CAAGCCTCCC GTGATTGGTC TTATAACAGT GCTCTTCCTC CTAGT CATAG GCGCCTGCGT 14100 

CTACTGCTGC ATTCGCGTGT 7CCTGGCGGC 7CGAC7G7GG CAC7AGGCAG 14 160 

GGCCACCGTG GCGTATCAGG TCCTTCGCAC CCTGGGACCG CAGG CCGGG7 CACATGCACC 14 22 0 

GCCGACGGTG GG CAT AG CTA CCCAGGAGCC CTACCGTACA ATATACATGC CAGATTAGAA 14 2 80 

CGGGGTGTGT GCTATAATGG ATGGCTATGG GGGGGGGCTG TAGATAATTG AGCGCTGTGC 14 340 

TTTTATTG7G GGGATATGGG CT7G7ACA7G 7G7CTATCAT CGGTAGCCAT AAAATGGGCC 14 4 00 

ATGACAACTG CCACAAGTAA GTCGTCCGAC ATGTG CT777 GCTTGGCGCT GTATGACTGC 14460 

CCTCCA7CCC TAAGCGGGAC GCACTTGATC GCGCGGACCT GTTCTACCAG GTAGGTCACC 14 52C 

GGGTCAAATG ATATTTTGAT GGTG7TGGAC ACCACCGTCT GGCTGGCGCT CAGGGTGCCG 14 580 

GAGTTCAGAG CGTAGATGAA TGTCTCAAAC GCGGAGGATT TCTCGCCTCC CAACATGTAA 1464 0 

ATTGGCCACT GCAGGCCGCT GCTCTTGTCA GTATAGTGTA GAAAATGTAT GGGGAGCGGG 14 700 

CATATTTCGT TAAGGACGGT TGCAATGGCC ACCCCAGAAT CTTGGCTGCT GTTGCCTTC3 14 76 0 

ACC3CCGCGT TCACGCGCTC AATTGTGGTG TGGAGCACAG CGATCGCCTT AATCATCGTS 14 820 

CATGCGCAGG ACGCTATCTC G7AAGCAGCT GCGCCAC7GA GG7CGCGCAG GAAGAAATGC 14880 

7CCATGCCCA ATATGAGGCT TCTGGTGGGA GTCTGAGTAC TCGTGACAAC GGCGCCCACG 14 94 0 

CCAGTACCGG ACGCCTCCG7 GTTGTTCG7A 7 AC G CGGGG T CGATGTAAAC AAACAGCTGT ISOO: 

FIGURE 3A-9

7TTCCAAGGC AC7TCTGAAC CTCTTGGGCG GTGGTG7C7A CCCGACACAT G7CAAACTGT 15 060 

GTCAGCGCTG CG7CACCCAC CACGCGGTAA AG C GT AG CAT TTGACGACGC TGCTCCCTCG 1512 0 

C C CATTAG7T CGGTGTCGAA TGCCCCCTCC A7AAAGAGG7 TGGTGGTGGT TTTGATGGAT -5180 

TCGTCGATGG TGATGTACGT CGGAATG7GC AGTCTGTAAC AAGG A CAGGA CACTAGTGC3 15 240 

TC7TGCAGGT GGAAA7CTTC 7CGG7GG7CC GCACACACGT AACTGACCAC ATT C AG CATC 15300 

TTTTCCTGGG CG7TCCTGAG GTTAAGCAGG AAACTCGTGG AGCGGTCTGA CGAGTTCAC3 15 360 

GATGATATAA ATATAAGCTT GGCGTCTTTC TGAAGCATGA AACCCAGAAT AGCCGGCAGT 15420 

GCATCCTTTT TAATAAAATT CGCCTCGTCT ACGTAGAGCA GGTTAAAGGT CTGTCCCCGA 154 8 0 

ATGCTCTGCA GACACGGAAA GACACAAAAG AGGGGCTCAT AAGCGGCTAA CAGTAAAGGA 15 54 0 

GAGGAGGCGA ACAGTGCGTG GCTCTTGGTT CTTGGGAATA AAAGGGGGCG TGTGTGCCGA 15 600 

TCGATCGTAT GGGTGAGCCA GTGGA7CCTG GACATGTGGT GAATGAGAAA GATTTTGAGG 15660 

AGTGTGAACA ATTTTTCAGT CAACCCCTTA GGG AG CAAGT GGTCGCGGGG GTCAGGGCAC 15 720 

TCGACGGCCT CGGTCTCGCT GACTCTCTAT GTCACAAAAC AGAAAGACTC TGCCTGCTGA 15 780 

TGGACCTGGT GGGCACGGAG TGCTTTGCGA GGGTGTGCCG CCTAGACACC GG7GCGAAA7 15 840 

GAAGAG7GTG GCGAGTCCCT TATGTCAGTT CCACGGCGTG TTTTGCCTGT ACCAGTGTCG 15 900 

CCAGTGCCTG GCATACCACG TGTGTGATGG GGGCGCCGAA TGCGTTCTCC 7GCATACGCC 15 960 

GGAGAGCGTC ATCTGCGAAC TAACGGGTAA CTGCA7GCTC GGCAACATTC AAGAGGGCCA 16 02 0 

GTTTTTAGGG CCGG7ACCG7 ATCGGACTTT GGA7AACCAG GTTGACAGGG ACGCATA7CA 16 0BO 

CGGGA7GC7A GCGTG7CTGA AACGGGA CAT TGTGCGGTAT TTG CAGA CAT GGCCGGACAC 1614 0 

CACCGTAATC G7G CAGGAAA 7AGCCCTGGG GGACGGCG7C ACCGACACCA TCTCGGCCAT 162 00 

TATAGATGAA ACATTCGGTG AGTGTCTTCC CGTACTGGGG GAGGCCCAAG GCGGGTACGC 16 2S0 

CCTGGTCTGT AGCATGTATC TGCACGTTAT CGTCTCCATC TATTCGACAA AAAC3GTGTA 16 32 0 

CAACAGTATG CTATTTAAAT GCACAAAGAA TAAAAAGTAC GACTGCATTG CCAAGCGGGT 1638 0 

GCGGACAAAA TGGATGCGCA TGCTATCAAC GAAAGATACG TAGGTCCTCG CTGCCACCGT 1644C 

TTGGCCCACG TGGTGCTGCC TAGGACCTTT CTGCTGCATC AC3CCATACC CCTGGAGCCC 16 50C 

GAGA7CA7CT TTTCCACCTA CACCCGGTTC AGZZGGTCZZ CAGGGTCATC CCGCCGGTTG 16S6C 

GTGG7GTG7G GGAAACG7GT CCTGCCAGGG 3AGGAAAACC AACTTGCGTC TTCACCTTCT 1662 0 

GGTT7GGCGC TTAGCCTGCC TCTGTTTTCC CACGATGGGA ACTTTCATCC ATTT3ACATC 166 8 0 

TCGGTACTGC GCATTTCCTG CCCTGGTTCT AATCTTAGTC TTACTGTCAG ATTTCTCTAT 1674C 

C7ATC7C7GG 7GG7GGCTAT GGGGGCGGGA CGGAATAATG CGCGGAGTCC GACCGTTGAC 16BCC 

GGGGTATCGC CGCCAGAGGG CGCCGTAGCC CACCCTTTGG AGGAACT3CA 3AG3CTGGCG 16 86 0 

CGTGCTACGC CGGACCCGGC ACTCACCCG7 GGACC37TGC AGGTCCTGA3 C33CrTTCTr 16 520 

CGCGCAGGGT CAGACGGAGA CCGCGCCACT CACCACATGG C3CTCGAGGC TCCGG3AACC 16 980 

GTGCGTGGAG AAAGCCTAGA CCCGCCTG7T TCACAGAAGG GGCCAGCGCG CACAC3CCAC 1704 0 

AGGCCACCCC CCCTGCGACT GAG CTTCAAC CCC3TCAATG CC3ATGTACC C3CTACCT33 1~100 
FIGURE 3A-10

CGAGACGCCA CTAACGTGTA CTCGGGTGCT CCCTACTATG TGTGTGTTTA CGAACGCGGT 17160

GGCCGTCAGG AAGACGACTG GCTGCCGATA CCACTGAGCT TCCCAGAAGA GCCCGTGCCC 17220

CCGCCACCOO GCTTAGTGTT CATGGACGAC TTGTTCATTA ACACGAAGCA GTGCGACTTT 17280

GTGGACACGC TAGAGGCCGC CTGTCGCACG CAAGGCTACA CGTTGAGACA GCGCGTGCCT 17340

GTCGCCATTC CTCGCGACGC GGAAATCGCA GACGCAGTTA AATCGCACTT TTTAGAGGCG 17400

TGCCTAGTGT TACGGGGGCT GGCTTCGGAG GCTAGTGCCT GGATAAGAGC TGCCACGTCC 17460

CCGCCCCTTG GCCGCCACGC CTGCTGGATG GACGTGTTAG GATTATGGGA AAGCCGCCCC 17520

CACACTCTAG GTTTGGAGTT ACGCGGCGTA AACTGTGGCG GCACGGACGG TGACTGGTTA 17580

GAGATTTTAA AACAGCCCGA TGTGCAAAAG ACAGTCAGCG GGAGTCTTGT GGCATGCGTG 17640

ATCGTCACAC CCGCATTGGA AGCCTGGCTT GTGTTACCTG GGGGTTTTGC TATTAAAGCC 17700

CGCTATAGGG CGTCGAAGGA GGATCTGGTG TTCATTCGAG GCCGCTATGG CTAGCCGGAG 17760

GCGCAAACTT CGGAATTTCC TAAACAAGGA ATGCATATGG ACTGTTAACC CAATGTCAGG 17820

GGACCATATC AAGGTCTTTA ACGCCTGCAC CTCTATCTCG CCGGTGTATG ACCCTGAGCT 17880

GGTAACCAGC TACGCACTGA GCGTGCCTGC TTACAATGTG TCTGTGGCTA TCTTGCTGCA 17940

TAAAGTCATG GGACCGTGTG TGGCTGTGGG AATTAACGGA GAAATGATCA TGTACGTCGT 18000

AAGCCAGTGT GTTTCTGTGC GGCCCGTCCC GGGGCGCGAT GGTATGGCGC TCATCTACTT 18060

TGGACAGTTT CTGGAGGAAG CATCCGGACT GAGATTTGCC TACATTGCTC CGCCGCCGTC 18120

GCGCGAACAC GTACCTGACC TGACCAGACA AGAATTAGTT CATACCTCCC AGGTGGTGCG 18180

CCGCGGCGAC CTGACCAATT GCACTATGGG TCTCGAATTC AGGAATGTGA ACCCTTTTGT 18240

TTGGCTCGGG GGCGGATCGG TGTGGCTGCT GTTCTTGGGC GTGGACTACA TGGCGTTCTG 18300

TCCGGGTGTC GACGGAATGC CGTCGTTGGC AAGAGTGGCC GCCCTGCTTA CCAGGTGCGA 18360

CCACCCAGAC TGTGTCCACT GCCATGGACT CCGTGGACAC GTTAATGTAT TTCGTGGGTA 18420

CTGTTCTGCG CAGTCGCCGG GTCTATCTAA CATCTGTCCC TGTATCAAAT CATGTGGGAC 18480

CGGGAATGGA GTGACTAGGG TCACTGGAAA CAGAAATTTT CTGGGTCTTC TGTTCGATCC 18540

CATTGTCCAG AGCAGGGTAA CAGCTCTGAA GATAACTAGC CACCCAACCC CCACGCACGT 18600

CGAGAATGTG CTAACAGGAG TGCTCGACGA CGGCACCTTG GTGCCGTCCG TCCAAGGCAC 18660

CCTGGGTCCT CTTACGAATG TCTGACTACT TCAGCCGCTT GCTGATATAT GAGTGTAAAA 18720

AACTTAAGGC CCTGGGCTTA CGTTCTTATT GAAGCATGTT GCGCACATCA GCGAGCTGGA 18780

CCGTCCTCCG GGTCGCGTGT AGATTATGGT TCCGTTCTCC TTCTTGATGT TTAAATTTTT 18840

GGGGGGGAAC CACCGACAAA GCGTCTTTAT GATTTCCGCG AACACGGAGT TGGCTACGTG 18900

CTTTTGGTGG GCTACGTACC CAATGTTAAT GTTCTCTACG GATGCCAGTA GCATGCTGA? 18960

GATCGCCACC ACTATCCATG TCTTTCCGTG TCTCCTTGGT ATTA3GAATA CGCTTGCCTT 19020

TTGCTTAAAC GTCTGTAAAA CACTGTTTGG AGTTTCAAAT AAACCGAAGT ACTGCTTAAA 19080

CAATCCAAAC AACTGGTGCG TCTTTTGTGG GGCCTTGATT GAAACCAAAA AGAAAAAAGT 19140

GTGCATTACT AGCTGCTGTT GGAAGGGCTC CAGCCAGTGC ACCCCGGGAA CGTAACAGCC 19200
FIGURE 3A-11

GTTCAGAAAG GACGAAAGGT TAACCAGAAA AGCCTGAAGT TCGCGGTAGA CAGAGCAGGC 19260

GTGCAGGGAG TCGTGTGTTT TTCTGCCCGC CTTGGTACTCG ACCAGTTGAT CGGCCGTGGA 19320

GACGTGCGCG TCCTCGCGCA CACACCGCAT CTGCAAGTAT GTTGATAGGG ACTCCAATAG 19380

GCGCGGCTTT GCGGGGACGT TGTCCTCGGA CGGTCTGGGG GTTCCCACGT CGGGATTTGC 19440

TGACGTGGGC GTGGCGGGAT GGTGCCGTGT GCAGTATGTT TCCAGGACCG AACTGTATGA 19500

GTTTATTCTG TGCACCACGC CAATAAAAGG GTGCGCCATC CGTGCCGTTT TGGGACAGTG 19560

TCGCGTGAAT GTCGGGGCAC TCAGTTCCCA CCTCTCTCCG GCGTCTTTGG CGGTCTCCTC 19620

CAGGTTGGCG GCAAGGCGCT CCCTGTGACG GCTGAGCAGC ATGTTTGCTT TGAGCTCGCT 19680

CGTGTCCGAG GGTGACCCGG AGGTGACCAG TAGGTACGTC AAGGGCGTAC AACTTGCCCT 19740

GGACCTTAGC GAGAACACAC CTGGACAATT TAAGTTGATA GAAACTCCCC TGAACAGCTT 19800

CCTCTTGGTT TCCAACGTGA TGCCCGAGGT CCAGCCAATC TGCAGTGGCC GOCCGGCCTT 19860

GCGGCCAGAC TTTAGTAATC TCCACTTGCC TAGACTGGAG AAGCTCCAGA GAGTCCTCGG 19920

GCAGGGTTTC GGGGCGGCGG GTGAGGAAAT CGCACTGGAC CCGTCTCACG TAGAAACACA 19980

CGAAAAGGGC CAGGTGTTCT ACAACCACTA TGCTACCGAG GAGTGGACGT GGGCTTTGAC 20040

TCTGAATAAG GATGCGCTCC TTCGGGAGGC TGTAGATGGC CTGTGTGACC CCGGAACTTG 20100

GAAGGGTCTT CTTCCTGACG ACCCCCTTCC GTTGCTATGG CTGCTGTTCA ACGGACCCGC 20160

CTCTTTTTGT CGGGCCGACT GTTGCCTGTA CAAGCAGCAC TGCGGTTACC CGGGCCCGGT 20220

GCTACTTCCA GGTCACATGT ACGCTCCCAA ACGGGATCTT TTGTCGTTCG TTAATCATGC 20280

CCTGAAGTAC ACCAAGTTTC TATACGGAGA TTTTTCCGGG ACATGGGCGG CGGCTTGCCG 20340

CCCGCCATTC GCTACTTCTC GGATACAAAG GGTAGTGAGT CAGATGAAAA TCATAGATGC 20400

TTCCGACACT TACATTTCCC ACACCTGCCT CTTGTGTCAC ATATATCAGC AAAATAGCAT 20460

AATTGCGGGT CAGGGGACCC ACGTGGGTGG AATCCTACTG TTGAGTGGAA AAGGGACCCA 20520

GTATATAACA GGCAATGTTC AGACCCAAAG GTGTCCAACT ACGGGCGACT ATCTAATCAT 20580

CCCATCGTAT GACATACCGG CGATCATCAC CATGATCAAG GAGAATGGAC TCAACCAACT 20640

CTAAAAGAGA GTTTATTAAG TCGGCTCTGG AGGC-AACAT CAACAGGAGG GCAGCTGTAT 20700

CGCTATTTGA 20710
FIGURE 3B

SEQ. ID. NO. 36 

GGATCCCTCT GACAACCTTC AGATAAAAAA CGTATATGCC CCCTTTTTTC AGTGGGACAG
CAACACCCAG CTAGCAGTGC TACCCCCATT TTTTAGCCGA AAGGATTCCA CCATTGTGCT
CGAATCCAAC GGATTTGACC CCGTGTTCCC CATGGTCGTG CCGCAGCAAC TGGGGCACGC
TATTCTGCAG CAGCTGTTGG TGTACCACAT CTACTCCAAA ATATCGGCCG GGGCCCCGGA
TGATGTAAAT ATGGCGGAAC TTGATCTATA TACCACCAAT GTGTCATTTA TGGGGCGCAC
ATATCGTCTG GACGTAGACA ACACGGATCC
FIGURE 3C

SEQ. ID. NO. 37 



GGATCCGCTG GCAGGTGGGC GCGCACCTCG TCGGGTAGCT TGGAGACAAA CAGCTCCAGG 60

CCAGTCCGCG CCGTAGCGCC TGCAGGTGCC TCACCACCGG GGCCGGGTCA TGCGATCTGT 120

TTAGTCCGGA GAAGATAGGG CCCTTGGGAA GCCGCTGAAC CAGCTCCAGG GTCTCCAAGA 180

TGCGCACCGG TTGTCGGAGC TGTCGCGATA GAGGTTAGGG TAGGTGTCCG GTCCGTCCGT 240

GGGCTCAAAC CTGCCCAGAC ACACCACTGT CTGCTGGGGG ATCATCCTTC TCAGGGAGAT 300

GCATTCTTTG GAAGTAGTGG TAGAGATGGA GCAGACTGCC AGGGCGTTGC AGGAGTGGTG 360

GCGATGGTGC GCACCG[illegible] TAAGAAACCC CCCAGGGTGG GGACTCCCGC TCCCTGCAGC 420

ATCTCGGCCT GCTGTACGTC CTTGGCGAAT ATGCGACGAA ATCGGCTGTG CGCACGGGGT 480

CCCAGGGCCG GTCCTGGTGGC ATACAGGCCG GTGAGGGCCC CCTGGGTCTG TCCGCCTGGA 540

AACAGGGTGC TGTGAAACAA CAGGTTGCAA GGCCGCGAAT ACCCCTCTGC ACGCTGCTGT 600

GGACGTGGGT GTATGCTCCG TGGATCC 627
FIGURE 3D

SEQ. ID. NO. 38 

AGCCGAAAGG ATTCCACCAT TGTGCTCGAA TCCAACGGAT TTGACCCCGT GTTCCCCATG 60

GTCGTGCCGC AGCAACTGGG GCACGCTATT CTGCAGCAGC TGTTGGTGTA CCACATCTAC 120

TCCAAAATAT CGGCCGGGGC CCCGGATGAT GTAAATATGG CGGAACTTGA TCTATATACC 180

ACCAATGTGT CATTTATGGG GCGCACATAT CGTCTGGACG TAGACAACAC GGA 233
FIGURE 3E

SEQ. ID. NO. 39 



GAAATTACCC ACGAGATCGC TTCCCTGCAC ACCGCACTTG GCTACTCATC A3TCATCGCC 60

CCGGCCCACG TGGCCGCCAT AACTACAGAC ATGGGAGTAC ATTGTCAGGA CCTCTTTATG 120

ATTTTCCCAG GGGACGCGTA TCAGGACCGC CAGCTGCATG ACTATATCAA AATGAAAGCG 180

GGCGTGCAAA CCGGCTCACC GGGAAACAGA ATGGATCACG TGGGATACAC TGCTGG[illegible] 240

CCTCGCTGCG AGAACCTGCC CGGTTTGAGT CATGGTCAGC TGGCAACCTG CGAGATAATT 300

CCCACGCCGG TCACATCTGA CGTTGCCT 328
FIGURE 3F

SEQ. ID. NO. 40 

AACACGTCAT GTGCAGGAGT GACATTGTGC CGCGGAGAAA CTCAGACC3C ATCCC3TAAC 60

CACACTGAGT GGGAAAATCT GCTGGCTATG TTTTCTGTGA TTATCTATGC CTTAGATCAC 120

AACTGTCACC CG 132
Probe: KS330Bam KS627Bam
Enzyme: Pvu II
FIGURE 7
FIGURE 9
FIGURE 10
FIGURE 11
FIGURE 13
FIGURE 15A
FIGURE 15B
FIGURE 16A FIGURE 16B
FIGURE 17
FIGURE 19A FIGURE 19B
FIGURE 19C FIGURE 19D
FIGURE 20A
FIGURE 20B
FIGURE 22



PCR analysis of KS330233 in DNA samples from patients
with Kaposi's sarcoma and tumor controls

                        No. tested    KS330233 positive (%)

KS tissue:
  AIDS-KS                   24              22 (92)
  Endemic KS                20              17 (85)
  Total                     44              39 (89)

Control Tumors:
  HIV seropositive           7               1 (14)
  HIV seronegative          15               2 (13)
  Total                     22               3 (14)